Automated Stopwords Identification in Punjabi Documents

Rajeev Puri ¹, R. P. S. Bedi ¹, Vishal Goyal ²

Affiliations
1 Punjab Technical University, Kapurthala Road, Jalandhar, India
2 Dept of Comp. Sc, Punjabi University, Patiala, India

Abstract

Many information retrieval (IR) tasks must classify huge amounts of data before producing final results. Not all of the data processed in an IR task is useful to the researcher. A method is therefore needed to identify such uninformative data (called stop words) and remove it from the data set before the IR task begins. Doing so brings two benefits: it reduces the overall vector space, which improves execution speed, and it improves the relevance of the results. The purpose of this paper is to find a suitable, automated method for identifying stop words in Punjabi text.
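
The abstract does not say which identification method the paper adopts. As a rough illustration of the general idea only, the following Python sketch flags tokens that occur in most documents as candidate stop words and shows how removing them shrinks the vocabulary (the dimensions of the vector space). The function names, the 0.8 document-frequency threshold, and the three-sentence toy corpus are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def candidate_stop_words(documents, min_doc_ratio=0.8):
    """Return tokens that occur in at least min_doc_ratio of the documents.

    Hypothetical baseline: high document frequency is used as the only
    signal. The threshold value is an assumption, not the paper's setting.
    """
    doc_freq = Counter()
    for doc in documents:
        # Count each token once per document (document frequency).
        doc_freq.update(set(doc.split()))
    threshold = min_doc_ratio * len(documents)
    return {tok for tok, df in doc_freq.items() if df >= threshold}


def vocabulary(documents, stop_words=frozenset()):
    """Collect the distinct tokens that remain after dropping stop words."""
    return {
        tok
        for doc in documents
        for tok in doc.split()
        if tok not in stop_words
    }


# Toy, made-up corpus of whitespace-tokenized Punjabi sentences.
docs = [
    "ਇਹ ਕਿਤਾਬ ਦਾ ਪੰਨਾ ਹੈ",
    "ਇਹ ਸ਼ਹਿਰ ਦਾ ਨਕਸ਼ਾ ਹੈ",
    "ਇਹ ਘਰ ਦਾ ਬੂਹਾ ਹੈ",
]

stops = candidate_stop_words(docs)
full_vocab = vocabulary(docs)
reduced_vocab = vocabulary(docs, stops)

print(sorted(stops))
print(len(full_vocab), len(reduced_vocab))
```

On this toy corpus the three function words shared by every sentence are flagged, and the vocabulary drops from nine distinct tokens to six. A real corpus would need proper Punjabi tokenization and a threshold tuned on data.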