
Syntactical Knowledge Based Stemmer for Automatic Document Summarization


Affiliations
1 Department of Electronics Engg., MIT Pune's MAE, Alandi, Pune, Maharashtra, India
2 Department of Electronics Engg., DIAT (Deemed University), Pune, India
     



With the rapid growth of data on the Internet, users are overloaded with huge amounts of information, which makes large volumes of documents more difficult to access. Automatic text summarization is an important technique for analyzing high-volume text documents. Text summarization condenses the source text into a shorter version while preserving its information content and overall meaning. The proposed system generates a summary of a given input document by identifying and extracting its important sentences. The model consists of four stages. In the first stage, the system decomposes the given text into its constituent sentences. The second stage removes stop words and stems the text. The third stage assigns POS tags using dependency grammar. Finally, the sentences are ranked according to feature terms. This paper presents our work up to the stemming process. The stemmer implemented here promises good results.
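
The stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the suffix list in `toy_stem`, and the frequency-based scoring in `rank_sentences` are all assumptions chosen for illustration, and the POS-tagging stage is omitted because the abstract does not describe it in enough detail.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; not from the paper).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "the", "to"}

# Suffixes stripped by the toy stemmer, checked in this order (assumption).
SUFFIXES = ("ing", "ed", "es", "s")


def split_sentences(text):
    """Stage 1: split text at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def toy_stem(word):
    """Strip at most one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(sentence):
    """Stage 2: lowercase, tokenize, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [toy_stem(t) for t in tokens if t not in STOP_WORDS]


def rank_sentences(text):
    """Ranking stand-in: score each sentence by its summed stem frequency."""
    sentences = split_sentences(text)
    stems = [preprocess(s) for s in sentences]
    freq = Counter(stem for sent in stems for stem in sent)
    scored = [(sum(freq[t] for t in sent), s) for sent, s in zip(stems, sentences)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank_sentences(text)` returns `(score, sentence)` pairs with the highest-scoring sentence first; an extractive summary would keep the top few. A plain suffix stripper like this over-stems and under-stems freely, which is the kind of weakness a syntactically informed stemmer is meant to reduce.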

Keywords

Stemming, Sentence Ranking, Text Summarization.

Authors

Dipti Y. Sakhare
Department of Electronics Engg., MIT Pune's MAE, Alandi, Pune, Maharashtra, India
Raj Kumar
Department of Electronics Engg., DIAT (Deemed University), Pune, India
