Stance and Sentiment Analysis of Health-related Tweets with Data Augmentation


Affiliations
1 Department of Computer Engineering, Graduate School of Natural and Applied Sciences, Gazi University, Turkey
2 Department of Management Information Systems, Faculty of Applied Sciences, Gazi University, Turkey

Common social media platforms such as Twitter are important sources of up-to-date information for several monitoring purposes, including real-time public health monitoring. Large volumes of health-related social media posts (such as tweets on the COVID-19 pandemic) have been produced recently and are ready to be analyzed to facilitate health-related decision making. In this paper, joint stance detection and sentiment analysis are performed on tweets about COVID-19 vaccination, in order to showcase the contribution of different machine learning and deep learning techniques equipped with data augmentation. Training and test tweet datasets are compiled and annotated for both stance and sentiment analysis; the training dataset is then extended using an automatic data augmentation technique. Experiments with different classifiers are performed for automated stance and sentiment analysis, using this extended dataset during training. The data augmentation technique, adopted in this study to cope with the data scarcity problem in machine learning research, leads to better performance in this domain of health-related social media analysis. Comparative evaluations are also performed using a publicly available sentiment analysis tool. The extended dataset and the test dataset, along with the approaches and evaluation results, are significant for health informatics because they facilitate joint estimation of instant community stance and sentiment towards COVID-19 vaccination, which has been an important public health concern. Public health decision-makers can therefore readily benefit from the findings and resources of the current study.

Keywords

Deep learning, Health informatics, Machine learning, Natural language processing, Public health monitoring
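
The abstract does not name the augmentation technique that was used, so the following is only an illustrative sketch of how a small annotated tweet set might be extended automatically while keeping both labels attached to each new example. It assumes simple token-level edits (random swap and random deletion); the record layout, the label values, the function names, and the parameters are all hypothetical and are not taken from the study.

```python
import random
from typing import List, Tuple

# Hypothetical record layout: (tweet text, stance label, sentiment label).
Example = Tuple[str, str, str]


def random_swap(tokens: List[str], n_swaps: int, rng: random.Random) -> List[str]:
    """Return a copy of tokens with n_swaps randomly chosen pairs exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens: List[str], p_delete: float, rng: random.Random) -> List[str]:
    """Drop each token with probability p_delete, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p_delete]
    return kept if kept else [rng.choice(tokens)]


def augment(dataset: List[Example], copies_per_example: int = 2, seed: int = 0) -> List[Example]:
    """Return the original examples followed by perturbed copies.

    Each copy inherits the stance and sentiment labels of its source tweet.
    """
    rng = random.Random(seed)
    extended = list(dataset)
    for text, stance, sentiment in dataset:
        tokens = text.split()
        for k in range(copies_per_example):
            # Alternate between the two edit operations (an arbitrary choice).
            if k % 2 == 0:
                new_tokens = random_swap(tokens, 1, rng)
            else:
                new_tokens = random_deletion(tokens, 0.1, rng)
            extended.append((" ".join(new_tokens), stance, sentiment))
    return extended


# Made-up examples with hypothetical label values.
train = [
    ("the vaccine rollout gives me hope", "favor", "positive"),
    ("I will not take this vaccine", "against", "negative"),
]
extended = augment(train, copies_per_example=2, seed=42)
```

One caveat with label-preserving edits of this kind: deleting a negation word such as "not" can reverse the stance of a tweet, so a low deletion probability or a list of protected tokens would likely be needed in practice.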

Authors

Doğan Küçük
Department of Computer Engineering, Graduate School of Natural and Applied Sciences, Gazi University, Turkey
Nursal Arıcı
Department of Management Information Systems, Faculty of Applied Sciences, Gazi University, Turkey
