Syntactical Knowledge Based Stemmer for Automatic Document Summarization

Dipti Y. Sakhare; Raj Kumar

Affiliations
1 Department of Electronics Engg., MIT Pune's MAE, Alandi, Pune, Maharashtra, India
2 Department of Electronics Engg., DIAT (Deemed University), Pune, India

With the rapid growth of data on the Internet, users are overloaded with huge amounts of information, which makes large volumes of documents difficult to access. Automatic text summarization is an important technique for analyzing high-volume text documents. Text summarization condenses the source text into a shorter version while preserving its information content and overall meaning. The proposed system generates a summary of a given input document by identifying and extracting the important sentences in the document. The model consists of four stages. In the first stage, the system decomposes the given text into its constituent sentences. The second stage removes the stop words and stems the text. The third stage assigns POS tags using dependency grammar. Finally, the sentences are ranked according to feature terms. This paper presents our work up to the stemming process. The stemmer implemented here promises good results.
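
The first two stages described in the abstract (sentence decomposition, then stop-word removal and stemming) can be illustrated with a minimal Python sketch. This is not the stemmer proposed in the paper: the stop-word list, the suffix list, and the function names `split_sentences`, `toy_stem`, `preprocess`, and `summarization_front_end` are all assumptions made for illustration, and the plain suffix stripping shown here stands in for the syntactical knowledge based rules, which the abstract does not specify.

```python
import re

# Tiny illustrative stop-word list (an assumption; the paper does not give its list).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "in", "to", "and", "into", "its"}

# Toy suffix list, checked in this order (an assumption; NOT the paper's stemmer rules).
SUFFIXES = ("ing", "ed", "es", "s")


def split_sentences(text):
    """Stage 1: break the text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def toy_stem(word):
    """Strip the first matching suffix, keeping a stem of at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(sentence):
    """Stage 2: lowercase, tokenize, drop stop words, and stem the remaining words."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [toy_stem(t) for t in tokens if t not in STOP_WORDS]


def summarization_front_end(text):
    """Run stages 1 and 2 and return one list of stems per sentence."""
    return [preprocess(s) for s in split_sentences(text)]
```

Calling `summarization_front_end` on a short document returns one list of stems per sentence, which is the kind of input the later POS tagging and ranking stages would consume. A naive suffix stripper like this one produces non-words such as "decompos", which is one reason a stemmer informed by syntactic knowledge may be preferable.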