
Clustering and Feature Specific Sentence Extraction Based Summarization of Multiple Documents


Affiliations
1 Department of CSE, Kongu Engineering College, Tamilnadu, India
 

This paper presents an approach that groups multiple documents with a document clustering algorithm and produces a cluster-wise summary using a feature-profile-oriented sentence extraction strategy. Related documents are grouped into the same cluster. A feature profile is generated by considering word weight, sentence position, sentence length, sentence centrality, proper nouns in the sentence, and numerical data in the sentence. Based on the feature profile, a sentence score is calculated for each sentence. Sentences in each cluster are ranked in order of importance by sentence score and extracted according to different compression rates. The extracted sentences are arranged in chronological order, as in the original documents, and from this the cluster-wise summary is generated. Experimental results show that the proposed clustering algorithm is efficient and that the feature profile can be used to extract the most important sentences from multiple documents.
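The scoring and extraction steps described in the abstract can be illustrated with a minimal sketch. The abstract names the six features but gives no formulas, so every formula below, the equal weighting of features, and the function names `feature_profile`, `sentence_scores`, and `summarize` are assumptions made for illustration, not the authors' method. The sketch treats one cluster as a flat list of sentences already in their original order.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens (a simple stand-in for real preprocessing)."""
    return re.findall(r"[A-Za-z0-9]+", text.lower())


def feature_profile(sentences):
    """Return one dict of six features per sentence, each scaled to [0, 1].

    All formulas here are illustrative assumptions; the abstract does not define them.
    """
    tokens = [tokenize(s) for s in sentences]
    tf = Counter(w for t in tokens for w in t)
    max_tf = max(tf.values(), default=1)
    max_len = max((len(t) for t in tokens), default=1) or 1
    n = len(sentences)
    profiles = []
    for i, (sentence, toks) in enumerate(zip(sentences, tokens)):
        words = set(toks)
        # Vocabulary of all other sentences, used for a word-overlap centrality.
        others = {w for j, u in enumerate(tokens) if j != i for w in u}
        raw = sentence.split()
        profiles.append({
            # Average term frequency of the sentence's words, relative to the top term.
            "word_weight": sum(tf[w] for w in toks) / ((len(toks) or 1) * max_tf),
            # Earlier sentences get a higher position value.
            "position": (n - i) / n,
            # Length relative to the longest sentence.
            "length": len(toks) / max_len,
            # Share of the sentence's words that also occur in other sentences.
            "centrality": len(words & others) / (len(words) or 1),
            # Capitalized words after the first word, as a crude proper-noun proxy.
            "proper_nouns": sum(1 for w in raw[1:] if w[:1].isupper()) / (len(raw) or 1),
            # Share of tokens that are purely numeric.
            "numerical": sum(1 for w in toks if w.isdigit()) / (len(toks) or 1),
        })
    return profiles


def sentence_scores(sentences):
    """Score each sentence as the unweighted sum of its features (an assumption)."""
    return [sum(p.values()) for p in feature_profile(sentences)]


def summarize(sentences, compression_rate):
    """Keep the top-scoring fraction of sentences, returned in original order."""
    if not sentences:
        return []
    k = max(1, int(compression_rate * len(sentences)))
    scores = sentence_scores(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])  # restore the order of the original documents
    return [sentences[i] for i in keep]
```

Calling `summarize(cluster_sentences, 0.5)` would keep roughly half of a cluster's sentences. A real system would likely tune per-feature weights and use a part-of-speech tagger for proper nouns rather than the capitalization heuristic used here.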

Keywords

Feature Profile, Multi-Document Summarization, Sentence Extraction, Document Clustering.

Authors

A. Kogilavani
Department of CSE, Kongu Engineering College, Tamilnadu, India
P. Balasubramani
Department of CSE, Kongu Engineering College, Tamilnadu, India
