
Adaptive Word Embedding to Reduce the Dimensionality of the Document to Vector Representation


Affiliations
1 Assistant Professor, Department of Information Technology, M.Kumarasamy College of Engineering, Karur, Tamil Nadu, India
2 UG Scholar, Department of Information Technology, M.Kumarasamy College of Engineering, Karur, Tamil Nadu, India
     






Authors

M. Gunasekar
Assistant Professor, Department of Information Technology, M.Kumarasamy College of Engineering, Karur, Tamil Nadu, India
M. Dhayalan
UG Scholar, Department of Information Technology, M.Kumarasamy College of Engineering, Karur, Tamil Nadu, India
N. Pradeep
UG Scholar, Department of Information Technology, M.Kumarasamy College of Engineering, Karur, Tamil Nadu, India
S. Sakthivel
UG Scholar, Department of Information Technology, M.Kumarasamy College of Engineering, Karur, Tamil Nadu, India
R. Venkatesh
UG Scholar, Department of Information Technology, M.Kumarasamy College of Engineering, Karur, Tamil Nadu, India

Abstract


Sentiment analysis is the task of detecting the emotions expressed in text, and it is an application of Natural Language Processing (NLP). NLP makes it possible to process the everyday language that people use, which helps to decipher users' sentiments and hence explain their likes and dislikes. Traditional bag-of-words models lack accuracy in sentiment classification. The aim of this project is to improve the accuracy of sentiment classification by employing dimensionality reduction. Reducing the dimensionality of a large document lowers the computational cost and increases efficiency. Word embedding methods capture the context of a word in a document, which helps to reduce the dimensionality of text data. Representing words as vectors with a technique such as Word2Vec proves very effective in interpreting meaning and hence sentiment. The words in the document are converted into vectors: each word is assigned a vector that represents its context, meaning, and semantics. The resulting word vectors are used to train machine learning classifiers for sentiment classification. We use the Naive Bayes classifier to analyze sentiment from the pre-processed dataset (word vectors). Our experiments on real-world datasets show an improvement in the accuracy of sentiment classification when word embedding techniques are used.

Keywords


Dimensionality Reduction, Sentiment Analysis, Vector Representation, Word Embedding
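
The pipeline outlined in the abstract (words converted to vectors, vectors used to train a Naive Bayes classifier) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the embedding table TOY_VECTORS is hand-made rather than learned by Word2Vec, averaging word vectors into a document vector is one common choice that the abstract does not confirm, and the Gaussian variant of Naive Bayes is assumed because the abstract does not name a variant. All function and class names are hypothetical.

```python
import math
from collections import defaultdict

# ASSUMPTION: hand-picked 2-dimensional vectors for illustration only.
# A real pipeline would learn these with a Word2Vec implementation.
TOY_VECTORS = {
    "good": (0.9, 0.1), "great": (0.8, 0.2), "love": (0.7, 0.1),
    "bad": (-0.8, 0.2), "awful": (-0.9, 0.1), "hate": (-0.7, 0.3),
    "movie": (0.0, 0.9), "film": (0.1, 0.8),
}


def document_vector(text):
    """Average the vectors of known words (assumed pooling strategy).

    The result has the embedding dimension (2 here), regardless of
    vocabulary size, which is the dimensionality reduction at stake.
    """
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    dim = len(next(iter(TOY_VECTORS.values())))
    if not vectors:
        return (0.0,) * dim
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dim))


class GaussianNaiveBayes:
    """Tiny Gaussian Naive Bayes (assumed variant) for dense features."""

    def fit(self, X, y):
        grouped = defaultdict(list)
        for features, label in zip(X, y):
            grouped[label].append(features)
        self.stats = {}
        for label, rows in grouped.items():
            prior = len(rows) / len(X)
            columns = list(zip(*rows))
            means = [sum(col) / len(col) for col in columns]
            # Small constant added to each variance to avoid division by zero.
            variances = [
                sum((x - m) ** 2 for x in col) / len(col) + 1e-3
                for col, m in zip(columns, means)
            ]
            self.stats[label] = (prior, means, variances)
        return self

    def predict(self, features):
        def log_posterior(label):
            prior, means, variances = self.stats[label]
            score = math.log(prior)
            for x, m, v in zip(features, means, variances):
                score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            return score

        return max(self.stats, key=log_posterior)


# Toy training data, invented for this sketch.
train_texts = ["good movie", "great film", "love movie",
               "bad movie", "awful film", "hate film"]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

model = GaussianNaiveBayes().fit(
    [document_vector(t) for t in train_texts], train_labels
)
prediction = model.predict(document_vector("great movie"))
print(prediction)  # prints "pos"
```

In this sketch each document is reduced to a vector whose length equals the embedding dimension, whereas a bag-of-words representation would need one feature per vocabulary word. With a trained embedding model the lookup table would be replaced by the learned vectors, and the rest of the flow would stay the same.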
