Using Sentence Simplification to Generate Paraphrase for Low Resource Punjabi Language

Ravinder Mohan Jindal ¹, Leekha Jindal ¹, Sanjeev Kumar Sharma ²

Affiliations
1 Research Scholar, SBBS University, Jalandhar, India
2 Associate Professor, DAV University, Jalandhar, India

Abstract

Natural language processing is a growing field of computer science, and paraphrase generation remains a difficult task, especially for morphologically rich, low-resource languages such as Hindi, Punjabi, and Urdu. This article addresses paraphrase generation for Punjabi, a morphologically rich Indian language, using a sentence simplification approach. The authors applied several sentence simplification algorithms to simplify long Punjabi sentences and used antonym-synonym replacement to generate the paraphrases. On a test data set, the sentence simplification component achieved a precision of 100%, a recall of 95%, and an F-measure of 97.43%. The system's performance was analyzed using several complexity measures, and a combination of lexical and syntactic simplification yielded the best results.
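As a quick consistency check on the reported figures, the F-measure can be recomputed from the stated precision and recall. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name `f_measure` is only illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "f-measure" is the balanced F1 variant.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, expressed as fractions.
precision = 1.00
recall = 0.95

score = f_measure(precision, recall)
print(f"F-measure: {score * 100:.4f}%")
```

Under this assumption the result is about 97.436%, which matches the reported 97.43% if the value was truncated rather than rounded to two decimal places.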

Keywords

NLP, Punjabi Language Processing, Paraphrasing, Syntactic Simplification, Lexical Simplification.

Research Cell: An International Journal of Engineering Sciences