
Statistical and Analytical Study of Guided Abstractive Text Summarization


Affiliations
1 Department of Computer Science and Engineering, Jawaharlal Nehru Technological University, Kakinada, India
2 Department of Computer Science and Engineering, M S Ramaiah Institute of Technology, Bengaluru, India
3 JNTUA College of Engineering, Jawaharlal Nehru Technological University, Anantapur, India
 

Abstractive summarization is the process of creating a condensed version of a given text document by collecting only its important information and restructuring that information into sentences that are simple and easy to understand. This article presents an analytical study of a process that generates abstractive summaries using a unified model with attribute-based information extraction (IE) rules and class-based templates. Documents are classified into several categories by term frequency/inverse document frequency (TF/IDF) rules. To generate information-intensive summaries, we use templates for sentence generation. The IE rules are designed to address the complexities of Indian regional languages. This paper statistically analyzes how the methodology adapts across multiple Indian languages and many document categories. Comparisons between abstractive and extractive summaries are also presented.

Keywords

Abstractive and Extractive Text Summarizations, Information Extraction, Language Parsing and Understanding, Template Selection, Template-Based Generation.
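
The pipeline outlined in the abstract (TF/IDF-based classification, attribute-based IE rules, class-based templates) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy English training sentences, the two categories, the regular-expression rules, the templates, and all function names are hypothetical, and cosine similarity over TF/IDF vectors stands in for whatever classification rules the authors actually use. Real IE rules for Indian regional languages would be considerably richer.

```python
import math
import re
from collections import Counter

# Hypothetical toy data: category -> example documents (not from the article).
TRAINING = {
    "sports": ["the team won the cricket match by 20 runs",
               "the player scored a century in the match"],
    "politics": ["the minister announced a new policy in parliament",
                 "the party won the election in the state"],
}
# Hypothetical attribute-based IE rules: category -> attribute -> regex.
IE_RULES = {
    "sports": {"winner": r"(?P<value>[A-Z]\w+) won",
               "margin": r"by (?P<value>\d+ (?:runs|wickets))"},
    "politics": {"actor": r"[Tt]he (?P<value>\w+) announced",
                 "topic": r"announced (?P<value>[^.]+)"},
}
# Hypothetical class-based templates, one per category.
TEMPLATES = {
    "sports": "{winner} won the match by {margin}.",
    "politics": "The {actor} announced {topic}.",
}

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def build_idf(docs):
    """Smoothed inverse document frequency for every term seen in docs."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    n = len(docs)
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(text, idf):
    """TF/IDF vector of text, restricted to terms with a known IDF."""
    counts = Counter(tokenize(text))
    total = sum(counts.values()) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(text):
    """Pick the category whose pooled examples are closest in TF/IDF space."""
    idf = build_idf([d for docs in TRAINING.values() for d in docs])
    query = tfidf(text, idf)
    scores = {cat: cosine(query, tfidf(" ".join(docs), idf))
              for cat, docs in TRAINING.items()}
    return max(scores, key=scores.get)

def extract(text, category):
    """Apply the category's IE rules; return only the attributes that matched."""
    found = {}
    for attr, pattern in IE_RULES[category].items():
        match = re.search(pattern, text)
        if match:
            found[attr] = match.group("value")
    return found

def summarize(text):
    """Classify, extract attributes, fill the template; None if any slot is empty."""
    category = classify(text)
    attrs = extract(text, category)
    if set(attrs) != set(IE_RULES[category]):
        return None
    return TEMPLATES[category].format(**attrs)

doc = ("India won the cricket match against Australia by 20 runs in Mumbai. "
       "The crowd cheered loudly.")
print(classify(doc))
print(summarize(doc))
```

Because the summary sentence is generated from a template rather than copied from the input, the output is abstractive in the sense the abstract describes; an extractive summarizer would instead return one or more of the original sentences unchanged.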



Authors

Jagadish S. Kallimani
Department of Computer Science and Engineering, Jawaharlal Nehru Technological University, Kakinada, India
K. G. Srinivasa
Department of Computer Science and Engineering, M S Ramaiah Institute of Technology, Bengaluru, India
B. Eswara Reddy
JNTUA College of Engineering, Jawaharlal Nehru Technological University, Anantapur, India







DOI: https://doi.org/10.18520/cs%2Fv110%2Fi1%2F69-72