
A Survey on Methods for Detecting Cyberbullying in Multilingual Documents


Affiliations
1 Department of Computer Science and Engineering, LBS Institute of Technology for Women, Thiruvananthapuram, Kerala, India
2 Department of Computer Science and Engineering, LBS Institute of Technology for Women, Trivandrum, Kerala, India
 

Abstract

Digital technologies are now swallowing the world. People of every age and gender are drawn to the colourful possibilities these technologies offer. Teenagers are the main victims of this digital era: they become addicted to games and the virtual world more quickly than any other age group, and they are at a critical, highly sensitive age. Teenagers also have a natural tendency to do things that catch the attention of others. This sometimes leads them to bully, harass or embarrass others on the internet or in other digital spaces such as social media sites, with a negative impact on those who are targeted; this is how the threat of cyberbullying arises. Social media users do not restrict themselves to English in their posts and comments; multilingual code-mixing and even code-switching are very prevalent. Surveys conducted among people of different ages in many countries have demonstrated various consequences of cyberbullying victimisation that lead to changes in behaviour and increased anxiety. Researchers have identified the need for computer-based solutions for detecting, preventing, mitigating or even stopping cyberbullying. This paper surveys computer-based techniques, concentrating on machine learning, deep learning and natural language processing, that aim to detect cyberbullying in online media.

Keywords

Bullying, Online Media, Code-Mixed Data, Hate Speech, Offensive Speech, Machine Learning, Deep Learning, Natural Language Processing.



Authors

Renetha J B
Department of Computer Science and Engineering, LBS Institute of Technology for Women, Thiruvananthapuram, Kerala, India
Deepthi P S
Department of Computer Science and Engineering, LBS Institute of Technology for Women, Trivandrum, Kerala, India

