
Sentiment Classification based on Linguistic Patterns in Citation Context


Affiliations
1 College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
 

Authors tend to express sentiment implicitly when they cite other work, which makes the sentiment of a citation context difficult to identify. Even so, they still use specific linguistic patterns to express it. This article explores the linguistic patterns of emotional expression in citation context and, on this basis, recognizes the sentiment polarity of citation context. A conditional random fields (CRF) model is introduced to annotate the logical relationship between syntactic structure and vocabulary in linguistic patterns. The role of linguistic patterns in classifying citation sentiment is discussed by analysing how well the generated CRF templates classify subjective/objective sentences and positive/negative emotional polarity in citation context. Experimental results show that the CRF model based on linguistic patterns outperforms the commonly used support vector machine (SVM) model in both the subjective/objective and the emotional polarity classification tasks. In the SVM model, the contextual information of the citation context is taken into account by introducing Word2vec, a neural word-embedding model. The results indicate that linguistic patterns extracted from the citation context reflect how authors organize their language when expressing emotion, and that extracting these patterns improves the performance of sentiment classification of citation context.
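The abstract does not give the form of the CRF templates, so the following is only a minimal sketch of how template-based features over words and part-of-speech tags might be expanded for one citation sentence. It assumes a CRF++-style `%x[row,col]` macro syntax. The templates, the padding markers, the example sentence and its tags are all hypothetical and are not taken from the article.

```python
import re

# Hypothetical CRF++-style unigram templates (an assumption, not the article's).
# %x[row,col] selects column `col` of the token `row` positions away from
# the current token; column 0 is the word, column 1 is its POS tag.
TEMPLATES = [
    "U00:%x[0,0]",           # current word
    "U01:%x[0,1]",           # current POS tag
    "U02:%x[-1,0]/%x[0,0]",  # previous word joined with current word
    "U03:%x[0,1]/%x[1,1]",   # current POS tag joined with next POS tag
]

_MACRO = re.compile(r"%x\[(-?\d+),(\d+)\]")


def expand(template, rows, i):
    """Fill one template for position i of a sentence given as (word, pos) rows."""
    def fill(match):
        r = i + int(match.group(1))
        c = int(match.group(2))
        if r < 0:
            return f"_B{r}"  # illustrative padding marker before the sentence
        if r >= len(rows):
            return f"_B+{r - len(rows) + 1}"  # padding marker after the sentence
        return rows[r][c]
    return _MACRO.sub(fill, template)


def sentence_features(rows, templates=TEMPLATES):
    """Return one list of feature strings per token."""
    return [[expand(t, rows, i) for t in templates] for i in range(len(rows))]


# Hypothetical citation-context fragment with hand-assigned POS tags.
EXAMPLE_ROWS = [
    ("However", "RB"),
    (",", ","),
    ("their", "PRP$"),
    ("method", "NN"),
    ("fails", "VBZ"),
]

if __name__ == "__main__":
    for (word, _), feats in zip(EXAMPLE_ROWS, sentence_features(EXAMPLE_ROWS)):
        print(word, feats)
```

Features of this kind pair each token with its lexical and syntactic neighbourhood, which is one plausible way to encode a linguistic pattern such as a contrastive adverb followed by a negative verb. A sequence labeller would then learn weights for each feature and label combination.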

Keywords

Citation Context, Conditional Random Fields, Linguistic Patterns, Sentiment Classification, Support Vector Machine.



Authors

Mingyang Wang
College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
Dongtian Leng
College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
Jinjin Ren
College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
Yiming Zeng
College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
Guangsheng Chen
College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China


DOI: https://doi.org/10.18520/cs%2Fv117%2Fi4%2F606-616