
Context-Based Feature Extraction Technique – LSI vs LDA


Affiliations
1 Department of Information Technology, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
2 Department of Computer Science and Engineering, KLN Information Technology, Madurai, Tamil Nadu, India
     



Abstract

The Internet holds an enormous number of documents, which need to be annotated before further processing. Customer reviews and product feedback are mostly analyzed using text mining or text analytics techniques. Feature extraction plays a vital role in text analytics: it selects the most relevant features, which are then used for text processing. This article examines Latent Dirichlet Allocation (LDA) as a feature extraction technique and compares it with the widely used Latent Semantic Indexing (LSI).


Keywords

Text Analytics, Feature Extraction, Latent Semantic Indexing (LSI), Latent Dirichlet Allocation (LDA), Document Categorization.
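
To make the comparison concrete, the following is a minimal sketch of how LSI and LDA features might be extracted from the same set of reviews. It assumes scikit-learn is available; the toy reviews, the choice of two topics, and the helper names `lsi_features` and `lda_features` are illustrative assumptions, not the setup used in the article.

```python
from sklearn.decomposition import LatentDirichletAllocation, TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Toy review corpus (hypothetical, not taken from the article).
reviews = [
    "battery life is great and the battery charges fast",
    "the camera takes sharp photos and the lens is great",
    "battery drains fast when the camera is on",
    "sharp screen but the battery is weak",
]

N_TOPICS = 2  # assumed number of latent dimensions / topics


def lsi_features(docs, n_components=N_TOPICS):
    """LSI: truncated SVD of the TF-IDF document-term matrix."""
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)
    svd = TruncatedSVD(n_components=n_components, random_state=0)
    return svd.fit_transform(tfidf)  # one dense vector per document


def lda_features(docs, n_components=N_TOPICS):
    """LDA: per-document topic proportions estimated from raw term counts."""
    counts = CountVectorizer(stop_words="english").fit_transform(docs)
    lda = LatentDirichletAllocation(n_components=n_components, random_state=0)
    return lda.fit_transform(counts)  # each row is a topic distribution


lsi = lsi_features(reviews)
lda = lda_features(reviews)
print("LSI feature matrix shape:", lsi.shape)
print("LDA feature matrix shape:", lda.shape)
```

In this sketch the LSI coordinates can be negative and have no probabilistic reading, whereas each LDA row is a non-negative topic distribution that sums to one. Either matrix could then be passed to a classifier for document categorization.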



Authors

A. M. Abirami
Department of Information Technology, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
A. Askarunisa
Department of Computer Science and Engineering, KLN Information Technology, Madurai, Tamil Nadu, India
T. S. B. Akshara
Department of Information Technology, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
G. Prasannashree
Department of Information Technology, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
K. Priyanga
Department of Information Technology, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
K. Sarika
Department of Information Technology, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India

