
A New Hybrid LSTM-RNN Deep Learning Based Racism, Xenomy, and Genderism Detection Model in Online Social Network



Hate speech, a problem that affects everyone in the world, is taking on new dimensions and becoming more violent every day. Interest in social media has grown considerably in recent years, particularly in the United States. Twitter ranked 5th in social media usage in 2022, with an average of 340 million users worldwide, and this expansion has made human moderation of social media unfeasible. Consequently, platforms leveraging deep learning approaches have been developed for machine translation, word tagging, and language understanding, and various strategies are used to build models that classify texts into categories. The goal of this research is to create an effective new hybrid prediction model that can recognize racist, xenophobic, and sexist comments published in English on Twitter, a popular social media platform, and provide efficient and accurate results. In the dataset used, 7.48 percent of the data were classified as racist, genderist, or xenophobic. A new hybrid model based on an LSTM Neural Network and a Recurrent Neural Network was developed in this study and compared with the most popular supervised classification models: Logistic Regression, Support Vector Machines, Naive Bayes, Random Forest, and K-Nearest Neighbors. The results of these models were thoroughly examined, and the LSTM Neural Network model was found to have the best performance, with an accuracy of 95.20 percent, a recall of 48.94 percent, a precision of 60.95 percent, and an F1 score of 51.32 percent. The percentage of test data was then varied and the comparisons repeated to obtain additional findings. With a larger dataset, these deep learning models are expected to produce substantially better outcomes.
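The abstract does not specify the exact layer configuration of the hybrid model, so the following NumPy sketch shows one plausible reading: an LSTM layer whose hidden states feed a simple (vanilla) recurrent layer, ending in a sigmoid score for binary hateful/non-hateful classification. All names and dimensions (`hybrid_forward`, `H1`, `H2`) are illustrative assumptions, and the weights are random and untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: input, forget, output gates and candidate cell state."""
    z = W @ x + U @ h + b                      # stacked pre-activations, shape (4*H,)
    H = h.shape[0]
    i = 1 / (1 + np.exp(-z[:H]))               # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))            # forget gate
    o = 1 / (1 + np.exp(-z[2*H:3*H]))          # output gate
    g = np.tanh(z[3*H:])                       # candidate cell state
    c = f * c + i * g                          # update cell state
    h = o * np.tanh(c)                         # update hidden state
    return h, c

def rnn_step(x, h, W, U, b):
    """One vanilla (simple) RNN step."""
    return np.tanh(W @ x + U @ h + b)

def hybrid_forward(seq, H1=8, H2=4):
    """LSTM layer feeding a SimpleRNN layer; final hidden state -> sigmoid score."""
    D = seq.shape[1]
    Wl, Ul, bl = rng.normal(0, 0.1, (4*H1, D)), rng.normal(0, 0.1, (4*H1, H1)), np.zeros(4*H1)
    Wr, Ur, br = rng.normal(0, 0.1, (H2, H1)), rng.normal(0, 0.1, (H2, H2)), np.zeros(H2)
    w_out = rng.normal(0, 0.1, H2)
    h1, c1, h2 = np.zeros(H1), np.zeros(H1), np.zeros(H2)
    for x in seq:                              # iterate over time steps (token embeddings)
        h1, c1 = lstm_step(x, h1, c1, Wl, Ul, bl)
        h2 = rnn_step(h1, h2, Wr, Ur, br)
    return 1 / (1 + np.exp(-w_out @ h2))       # probability the text is hateful

# A toy "tweet": 5 tokens, each a 16-dimensional embedding.
score = hybrid_forward(rng.normal(size=(5, 16)))
print(round(float(score), 4))
```

In practice such a model would be trained end to end on embedded tweet text; the sketch only illustrates how the LSTM's gated memory and the simpler recurrent layer are composed in a hybrid forward pass.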

Keywords

Artificial Intelligence, Deep Learning, Genderism, Racism, Xenophobia
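Because only 7.48 percent of the tweets are labeled racist, genderist, or xenophobic, a trivial classifier that marks every tweet as clean already reaches about 92.52 percent accuracy with zero recall, which is why the abstract reports precision, recall, and F1 alongside accuracy. The confusion matrix below is invented for illustration (it is not taken from the paper), sized to a hypothetical 10,000-tweet test set with the same 7.48 percent positive rate:

```python
def metrics(tp, fp, fn, tn):
    """Derive the four reported scores from a binary confusion matrix."""
    acc = (tp + tn) / (tp + fp + fn + tn)   # overall correctness
    prec = tp / (tp + fp)                   # how many flagged tweets were truly hateful
    rec = tp / (tp + fn)                    # how many hateful tweets were caught
    f1 = 2 * prec * rec / (prec + rec)      # harmonic mean of precision and recall
    return acc, prec, rec, f1

# Hypothetical 10,000-tweet test set, 748 (7.48%) of them hateful.
acc, prec, rec, f1 = metrics(tp=366, fp=234, fn=382, tn=9018)
print(f"accuracy={acc:.2%} precision={prec:.2%} recall={rec:.2%} F1={f1:.2%}")
```

The invented counts were chosen so that precision and recall land near the abstract's reported values; they show how a high accuracy can coexist with a much lower recall on a heavily imbalanced dataset.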

References

  • B. B. Gupta, S. R. Sahoo, Online social networks security: principles, algorithm, applications, and perspectives (CRC Press, 2021).
  • V. Lyding, E. Stemle, C. Borghetti, M. Brunello, S. Castagnoli, F. Dell’Orletta, ... & V. Pirrelli, The PAISA Corpus of Italian Web Texts, 9th Web as Corpus Workshop (WaC-9), Gothenburg, Sweden, 2014.
  • S. De, et al. An introduction to data mining in social networks, Advanced Data Mining Tools and Methods for Social Computing. Academic Press, 2022. 1-25.
  • R. Z. Ul, S. Abbas, M. A. Khan, G. Mustafa, H. Fayyaz, M. Hanif & M. A. Saeed, Understanding the language of ISIS: An empirical approach to detect radical content on Twitter using machine learning, 2021.
  • FifthTribe (2015). How ISIS Uses Twitter. Accessed: May 10, 2022. [Online]. Available: https://www.kaggle.com/fifthtribe/how-isis-uses-twitter
  • ActiveGalaxy, Kaggle (2016). ISIS Related Dataset. Accessed: May 10, 2022. [Online]. Available: https://www.kaggle.com/datasets/activegalaxy/isis-related-tweets
  • FifthTribe, Kaggle (2017). ISIS Religious Text. Accessed: May 10, 2022. [Online]. Available: https://www.kaggle.com/datasets/fifthtribe/isis-religious-texts
  • W. Sharif, S. Mumtaz, Z. Shafiq, O. Riaz, T. Ali, M. Husnain & G. S. Choi, An empirical approach for extreme behavior identification through tweets using machine learning. Applied Sciences, 9(18), 2019, 3723.
  • T. Ruttig, Kunduz Madrassa Attack, Al Jazeera, 2018, Accessed: May 10, 2022. [Online]. Available: https://www.aljazeera.com/opinions/2018/4/5/kunduz-madrassa-attack-losing-the-moral-high-ground
  • Y. Chen, Y. Zhou, S. Zhu and H. Xu, Detecting offensive language in social media to protect adolescent online safety, Proceedings, 2012, 71-80.
  • M. Wiegand, J. Ruppenhofer, A. Schmidt and C. Greenberg, Inducing a Lexicon of Abusive Words – a Feature-Based Approach, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies., 2018.
  • G. Xiang, B. Fan, L. Wang, J. Hong and C. Rose, Detecting offensive tweets via topical feature discovery over a large scale twitter corpus, Proc. 21st ACM Int. Conf. Inf. Knowl. Manag.- CIKM’12, 2012.
  • H. Chen, S. McKeever, & S. J. Delany, Abusive Text Detection Using Neural Networks, In AICS, (December 2017) 258-260.
  • A. Alrehili, Automatic hate speech detection on social media: A brief survey, In 2019 IEEE/ACS 16th International Conference on Computer Systems and Applications (AICCSA), IEEE, (2019, November), 1-6.
  • C. Van Hee, E. Lefever, B. Verhoeven, J. Mennes, B. Desmet, G. De Pauw, ... & V. Hoste, Detection and fine-grained classification of cyberbullying events. In Proceedings of the international conference recent advances in natural language processing, (2015, September), 672-680.
  • M. Anzovino, E. Fersini & P. Rosso, Automatic identification and classification of misogynistic language on twitter, In International Conference on Applications of Natural Language to Information Systems, Springer, Cham, (2018, June), 57-64.
  • B. Ross, M. Rist, G. Carbonell, B. Cabrera, N. Kurowsky & M. Wojatzki, Measuring the reliability of hate speech annotations: The case of the european refugee crisis, arXiv preprint arXiv:1701.08118, 2017.
  • A. Schmidt & M. Wiegand, A survey on hate speech detection using natural language processing, In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, Valencia, Spain, Association for Computational Linguistics, April 2017, 1-10.
  • B. Vidgen & L. Derczynski, Directions in abusive language training data, a systematic review: Garbage in, garbage out, Plos one, 15(12), e0243300, 2020.
  • Z. Waseem, T. Davidson, D. Warmsley & I. Weber, Understanding abuse: A typology of abusive language detection subtasks, arXiv preprint arXiv:1705.09899, 2017.
  • E. Wulczyn, N. Thain & L. Dixon, Ex machina: Personal attacks seen at scale, CoRR abs/1610.08914, 2016.
  • A. Alotaibi & M. H. A. Hasanat, Racism detection in Twitter using deep learning and text mining techniques for the Arabic language, In 2020 First International Conference of Smart Systems and Emerging Technologies (SMARTTEC), IEEE, (2020, November), 161-164.
  • E. Lee, F. Rustam, P. B. Washington, F. El Barakaz, W. Aljedaani & I. Ashraf, Racism Detection by Analyzing Differential Opinions Through Sentiment Analysis of Tweets Using Stacked Ensemble GCRNN Model. IEEE Access, 10, 2022, 9717-9728.
  • J. H. Park and P. Fung, One-step and Two-step Classification for Abusive Language Detection on Twitter, AICS Conference, 2017.
  • O. Istaiteh, R. Al-Omoush & S. Tedmori, Racist and sexist hate speech detection: Literature review, In 2020 International Conference on Intelligent Data Science Technologies and Applications (IDSTA), IEEE, 2020, 95-99.
  • S. Frenda, B. Ghanem, M. Montes-y-Gómez & P. Rosso, Online hate speech against women: Automatic identification of misogyny and sexism on twitter, Journal of Intelligent & Fuzzy Systems, 36(5), 2019, 4743-4752.
  • E. Fersini, D. Nozza & P. Rosso, Overview of the evalita 2018 task on automatic misogyny identification (ami), EVALITA Evaluation of NLP and Speech Tools for Italian, 12, 2018, 59.
  • E. Fersini, P. Rosso & M. Anzovino, Overview of the Task on Automatic Misogyny Identification at IberEval 2018, Ibereval@ sepln, 2150, 2018, 214-228.
  • J. Andreas, E. Choi & A. Lazaridou, Proceedings of the NAACL Student Research Workshop, June 2016.
  • P. Saha, B. Mathew, P. Goyal & A. Mukherjee, Hateminers: Detecting hate speech against women. arXiv preprint arXiv:1812.06700, 2018.
  • I. Kwok & Y. Wang, Locate the hate: Detecting tweets against blacks. In Twenty-seventh AAAI conference on artificial intelligence, (2013, June).
  • L. Hickman, S. Thapa, L. Tay, M. Cao & P. Srinivasan, Text preprocessing for text mining in organizational research: Review and recommendations, Organizational Research Methods, 25(1), 2022, 114-146.
  • Z. Waseem and D. Hovy, Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter, Proc. NAACL Student Res. Work, 2016, 88-93.
  • "Kaggle," 2020. [Online]. Available: https://www.kaggle.com/datasets/mrmorj/hatespeech-and-offensive-language-dataset


Authors

Sule Kaya
Department of Software Engineering, Firat University, Elazig-23000, Turkey
Bilal Alatas
Department of Software Engineering, Firat University, Elazig-23000, Turkey
