
BERT and IndoWordNet Collaborative Embedding for Enhanced Marathi Word Sense Disambiguation


Authors

Sandip S. Patil, R. P. Bhavsar, B. V. Pawar
School of Computer Sciences, K.B.C. North Maharashtra University, India

Abstract



Ambiguity in word meanings is a long-standing challenge in natural language processing; word sense disambiguation (WSD) addresses this challenge. Earlier neural language models rely on recurrent neural networks and long short-term memory (LSTM) architectures. Because these models process words sequentially, they are slow and not truly bidirectional, so they fail to capture the full contextual meanings of words and fall short of the contextual semantic representation that WSD requires. The more recent Bidirectional Encoder Representations from Transformers (BERT) is an attention-based transformer model that is deeply bidirectional: its attention mechanism weighs the relevance of the entire context at once, in both directions, making it well suited to leveraging distributed meaning representations for WSD. We use BERT to obtain contextual embeddings of both the sentence context of an ambiguous Marathi word and the IndoWordNet glosses of its candidate senses. For this purpose, we use 282 moderately ambiguous Marathi words covering 1,004 senses distributed over 5,282 Marathi sentences harvested by linguists from Marathi websites. We compute the semantic similarity between each context-gloss embedding pair using the Minkowski distance family and cosine similarity measures, and assign the most plausible sense to the ambiguous word. Our empirical evaluation shows that cosine similarity outperforms the distance measures, yielding an average disambiguation accuracy of 75.26% on the given Marathi sentences.
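The context-gloss matching pipeline summarized above lends itself to a compact illustration. The sketch below is a minimal, hypothetical rendering, not the authors' implementation: it assumes the bert-base-multilingual-cased checkpoint from the Hugging Face transformers library as a stand-in for whichever BERT variant was actually used, mean pooling over the last hidden layer as the sentence-embedding strategy, and hand-written glosses in place of real IndoWordNet entries.

import torch
from transformers import AutoModel, AutoTokenizer

# Hypothetical checkpoint choice; the abstract does not name the exact model.
MODEL_NAME = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def embed(text):
    # Mean-pool BERT's last hidden layer into one sentence vector, ignoring padding.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state         # (1, seq_len, 768)
    mask = inputs["attention_mask"].unsqueeze(-1).float()  # (1, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)    # (1, 768)

def cosine(a, b):
    return torch.nn.functional.cosine_similarity(a, b).item()

def minkowski(a, b, p=2.0):
    # Minkowski distance of order p: p=1 is Manhattan, p=2 is Euclidean.
    return torch.cdist(a, b, p=p).item()

def disambiguate(context, glosses):
    # Assign the sense whose gloss embedding is most similar to the context embedding.
    ctx = embed(context)
    return max(glosses, key=lambda sense: cosine(ctx, embed(glosses[sense])))

# Illustrative only: the Marathi word 'paan' can mean a tree leaf or a book page;
# these glosses are hand-written stand-ins, not actual IndoWordNet entries.
glosses = {
    "leaf": "झाडाचे हिरवे पान",
    "page": "पुस्तकातील छापील पान",
}
print(disambiguate("झाडावरून एक पान खाली पडले.", glosses))

The paper's finding that cosine similarity outperforms the Minkowski distances suggests that normalizing away embedding magnitude matters for this task; the minkowski helper above is included only to show how a distance measure from that family would plug into the same pipeline.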

Keywords

BERT, Distributional Semantics, Neural Language Modeling, Transfer Learning, Word Sense Disambiguation



