
Feature Selection using Random Forest Method for Sentiment Analysis


Affiliations
1 Department of CSE, Vel Tech University, Chennai - 600062, Tamil Nadu, India
 

Background/Objectives: Online reviews have become an important decision support system for customers deciding on a subscription or purchase. This paper proposes a method that improves the accuracy of the classifier. Methods/Statistical analysis: Feature selection for sentiment analysis is performed using a decision forest method, and Principal Component Analysis (PCA) is used for feature reduction. The proposed method is evaluated on a Twitter data set. Findings: The results show that the proposed decision forest based feature extraction improves the precision of the classifiers by 12.49% to 62.5% when compared to PCA and by 49.5% to 62.5% when compared to decision tree based feature selection. Application/Improvements: This method is applicable to product reviews, emotion detection, knowledge transformation, and predictive analytics.
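
As a rough illustration of forest-based feature selection as described in the abstract above, the following minimal sketch ranks terms by their mean decrease in Gini impurity across a small forest. It is not the authors' implementation. It assumes binary term-presence features and a hand-made toy set of labelled tweets, and it simplifies each tree to a single split (a decision stump) trained on a bootstrap sample with a random subset of candidate features. All names (`gini`, `best_split`, `forest_importances`, `vocab`, `tweets`) are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, candidates):
    """Return (feature index, impurity decrease) of the best binary split.

    Only the feature indices in `candidates` are considered. Returns
    (None, 0.0) when no candidate reduces the impurity.
    """
    parent = gini(labels)
    best_feature, best_gain = None, 0.0
    for f in candidates:
        absent = [y for x, y in zip(rows, labels) if x[f] == 0]
        present = [y for x, y in zip(rows, labels) if x[f] == 1]
        weighted = (len(absent) * gini(absent)
                    + len(present) * gini(present)) / len(labels)
        gain = parent - weighted
        if gain > best_gain:
            best_feature, best_gain = f, gain
    return best_feature, best_gain


def forest_importances(rows, labels, n_trees=200, seed=0):
    """Normalised impurity-decrease importance per feature.

    Each "tree" is a one-split stump (a simplification, assumed here for
    brevity) fitted on a bootstrap sample with sqrt(d) random candidates.
    """
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    k = max(1, int(d ** 0.5))
    totals = [0.0] * d
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        candidates = rng.sample(range(d), k)            # random feature subset
        f, gain = best_split([rows[i] for i in sample],
                             [labels[i] for i in sample],
                             candidates)
        if f is not None:
            totals[f] += gain
    total = sum(totals)
    return [t / total for t in totals] if total > 0 else totals


# Toy data (assumed for illustration): 1 = positive, 0 = negative sentiment.
vocab = ["good", "bad", "phone", "today"]
tweets = [
    ("good phone", 1), ("good today", 1), ("good phone today", 1), ("good", 1),
    ("bad phone", 0), ("bad today", 0), ("bad phone today", 0), ("bad", 0),
]
rows = [[1 if term in text.split() else 0 for term in vocab]
        for text, _ in tweets]
labels = [label for _, label in tweets]

importances = forest_importances(rows, labels)
ranked = sorted(zip(vocab, importances), key=lambda pair: pair[1], reverse=True)
selected = [term for term, _ in ranked[:2]]  # keep the two top-ranked terms
print(ranked)
print(selected)
```

In this toy setting the sentiment-bearing terms should dominate the ranking, while terms that occur equally often in both classes receive little importance. The PCA comparison reported in the abstract is not reproduced here.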

Keywords

Inverse Document Frequency (IDF), Learning Vector Quantization (LVQ), Opinion Mining, Principal Component Analysis (PCA), Sentiment Analysis, Twitter

Authors

Jeevanandam Jotheeswaran
Department of CSE, Vel Tech University, Chennai - 600062, Tamil Nadu, India
S. Koteeswaran
Department of CSE, Vel Tech University, Chennai - 600062, Tamil Nadu, India




DOI: https://doi.org/10.17485/ijst%2F2016%2Fv9i3%2F130262