
Deadwood Detection and Elimination in Text Summarization for Punjabi Language


Affiliations
1 Baba Farid College of Engineering and Technology, Bathinda, Punjab, Pin Code-151001, India
2 University College of Engineering, Punjabi University, Patiala, Punjab, Pin Code-147002, India
 

The rapid growth of the internet has produced a large amount of information. Text summarization provides a condensed version of such information that is no longer than half of the original text. This paper proposes a system for the detection and removal of Deadwood in summaries for the Punjabi language. Deadwood is a word or phrase that can be omitted without loss of meaning; removing it shortens and clarifies the summary. The first step is preprocessing, which consists of sentence segmentation and removal of Punjabi stop words. In the second step, a weight is assigned to each sentence in the source text using five different features. In the third step, the highest-scoring sentences are selected to form the summary. In the last step, the Deadwood is removed from the summary.
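
The four steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the five scoring features, so the two used here (average term frequency and sentence position) are placeholder assumptions, and the function names, the `ratio` parameter, and the stop-word and Deadwood lists passed in by the caller are all hypothetical. A real system would supply Punjabi stop-word and Deadwood phrase lists.

```python
import re
from collections import Counter

# Sentence terminators: the danda (U+0964) used in Punjabi text, plus common punctuation.
SENTENCE_END = re.compile(r"[।.?!]+")


def split_sentences(text):
    """Step 1a: split text into trimmed, non-empty sentences."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def content_words(sentence, stop_words):
    """Step 1b: lowercase whitespace tokens with stop words removed."""
    return [w for w in sentence.lower().split() if w not in stop_words]


def score_sentences(sentences, stop_words):
    """Step 2: weight each sentence.

    Assumption: term frequency and position stand in for the paper's
    five features, which the abstract does not list.
    """
    tokenized = [content_words(s, stop_words) for s in sentences]
    freq = Counter(w for words in tokenized for w in words)
    scores = []
    for i, words in enumerate(tokenized):
        tf = sum(freq[w] for w in words) / len(words) if words else 0.0
        position = 1.0 / (i + 1)  # earlier sentences score higher
        scores.append(tf + position)
    return scores


def remove_deadwood(sentence, deadwood_phrases):
    """Step 4: delete listed phrases that are delimited by whitespace."""
    for phrase in deadwood_phrases:
        pattern = r"(?<!\S)" + re.escape(phrase) + r"(?!\S)"
        sentence = re.sub(pattern, " ", sentence, flags=re.IGNORECASE)
    return " ".join(sentence.split())


def summarize(text, stop_words, deadwood_phrases, ratio=0.5):
    """Run all steps; keep at most `ratio` of the sentences (at least one)."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    scores = score_sentences(sentences, stop_words)
    keep = max(1, int(len(sentences) * ratio))
    # Step 3: pick the highest-scoring sentences, then restore source order.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:keep]
    return [remove_deadwood(sentences[i], deadwood_phrases) for i in sorted(top)]
```

In this sketch, stop words are excluded only when computing weights, so the selected sentences keep their original wording apart from the removed Deadwood. The default `ratio=0.5` mirrors the stated limit of half the original text, measured here in sentences rather than words.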

Keywords

Deadwood, Phrase, Summary.

Authors

Mandeep Kaur
Baba Farid College of Engineering and Technology, Bathinda, Punjab, Pin Code-151001, India
Jagroop Kaur
University College of Engineering, Punjabi University, Patiala, Punjab, Pin Code-147002, India
