
A Survey on Paraphrase Detection and Generation Techniques


Affiliations
1 Department of Computer Science and Applications, Sant Baba Bhag Singh University, Jalandhar, India
2 Department of Computer Applications, DAV University, Jalandhar, India
 

Whenever “the same thing” needs to be expressed in different ways, an automated paraphrase generation mechanism is useful. One reason paraphrase generation systems have been difficult to build is that paraphrases are hard to define. Although the strict interpretation of the term “paraphrase” is quite narrow, because it requires exactly identical meaning, in the linguistics literature paraphrases are most often characterized by an approximate equivalence of semantics across sentences or phrases. This paper presents a survey of paraphrase generation techniques for Indian and foreign languages.

Keywords

Paraphrasing, Sentence Simplification, Sentence Fusion, Sentence Compression.

Authors

Ravinder Mohan Jindal
Department of Computer Science and Applications, Sant Baba Bhag Singh University, Jalandhar, India
Vijay Rana
Department of Computer Science and Applications, Sant Baba Bhag Singh University, Jalandhar, India
Sanjeev Sharma
Department of Computer Applications, DAV University, Jalandhar, India

