
N-gram Based Word Sense Disambiguation of Hindi Post Position से (sē) in the context of Hindi to Punjabi Machine Translation System


Affiliations
1 Punjab Technical University, Kapurthala, India
2 Dept. of Computer Science, Punjabi University, Patiala, India
3 MM University, Sadopur, Ambala, India
 

India has many regional languages. Attempts have been made to develop machine translation between these languages, but little success has been reported so far. Analysis of the Hindi to Punjabi machine translation system developed by Punjabi University, Patiala, India found that the Hindi postposition से (sē) is mistranslated most of the time because it is ambiguous: it has eighteen different senses in Punjabi. The overall translation success rate of this system is reported as 87.60%, but the success rate for the postposition से (sē) is only about 2%. In this paper, the N-gram approach (along with its smoothing variants) is applied to improve the translation accuracy of से (sē) in the existing Hindi to Punjabi Machine Translation System. The bigram approach with the Add-One smoothing algorithm gives the best results, raising the translation accuracy of the postposition से (sē) from 2% to 85.49% and thereby improving the overall machine translation accuracy of the system from 87.60% to 92.30%.
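To make the bigram approach with Add-One smoothing concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the bigram is the pair (word preceding से, Punjabi sense); the abstract does not say which context is used. The romanized training pairs and the two sense labels `ton` and `naal` are hypothetical toy data standing in for the eighteen senses, and the function names are illustrative. The smoothed estimate is P(sense | prev) = (C(prev, sense) + 1) / (C(prev) + V), where V is the number of senses.

```python
from collections import Counter

# Hypothetical toy data: (word preceding "se", Punjabi sense label).
# A real system would draw these from a sense-tagged parallel corpus.
TRAINING = [
    ("dilli", "ton"),
    ("ghar", "ton"),
    ("ghar", "ton"),
    ("chaku", "naal"),
    ("kalam", "naal"),
    ("kalam", "naal"),
    ("ghar", "naal"),
]


def train(pairs):
    """Count (context, sense) bigrams and context unigrams."""
    bigram_counts = Counter(pairs)
    context_counts = Counter(prev for prev, _ in pairs)
    senses = sorted({sense for _, sense in pairs})
    return bigram_counts, context_counts, senses


def add_one_prob(prev, sense, model):
    """Add-One (Laplace) smoothed estimate of P(sense | prev)."""
    bigram_counts, context_counts, senses = model
    # Counter returns 0 for unseen keys, so unseen bigrams get count 1.
    return (bigram_counts[(prev, sense)] + 1) / (context_counts[prev] + len(senses))


def disambiguate(prev, model):
    """Pick the sense with the highest smoothed probability."""
    senses = model[2]
    return max(senses, key=lambda s: add_one_prob(prev, s, model))


model = train(TRAINING)
print(disambiguate("kalam", model))
print(add_one_prob("ghar", "ton", model))
```

On this toy data, `disambiguate("kalam", model)` selects `naal`, and an unseen context word gives every sense the same probability 1/V, so a practical system would need a fallback such as the most frequent sense overall.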

Keywords

Natural Language Processing (NLP), Word Sense Disambiguation (WSD), Machine Translation (MT).

Authors

Rakesh Kumar
Punjab Technical University, Kapurthala, India
Vishal Goyal
Dept. of Computer Science, Punjabi University, Patiala, India
Ravinder Khanna
MM University, Sadopur, Ambala, India
