
Feature Selection for Text Clustering and Classification


Affiliations
1 Thapar University, Patiala, India
2 CSED, Thapar University, Patiala, India
     



The quality of the data is one of the most important factors influencing the performance of any classification or clustering algorithm. The attributes that define the feature space of a given data set can often be inadequate, which makes it difficult to discover useful information or produce the desired output. However, even when the original attributes are individually inadequate, it is often possible to combine them to construct new attributes with greater predictive power. Feature selection, as a preprocessing step for machine learning, has been very effective in reducing dimensionality, removing irrelevant and noisy data, and improving the comprehensibility of results. This paper addresses the task of feature selection for clustering and classification. We present a comparative study of a variety of classification methods, including Naive Bayes and J48.

Keywords

Classification, Clustering, Feature Selection, Machine Learning.
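
As a rough illustration of feature selection as a preprocessing step for text classification, the sketch below scores each term by a chi-square statistic against a binary class label and keeps the highest-scoring terms. The chi-square criterion, the toy corpus, and the function names `chi_square_scores` and `select_top_k` are assumptions made for this example; the abstract does not say which selection criterion the paper uses.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each term by its chi-square statistic against a binary label.

    Assumption: labels are 0 or 1 and terms are whitespace-separated tokens.
    """
    n = len(docs)
    positive = sum(1 for y in labels if y == 1)
    negative = n - positive

    # Document frequency of each term within each class.
    df_pos, df_neg = Counter(), Counter()
    for text, y in zip(docs, labels):
        target = df_pos if y == 1 else df_neg
        target.update(set(text.lower().split()))

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]      # positive documents containing the term
        b = df_neg[term]      # negative documents containing the term
        c = positive - a      # positive documents without the term
        d = negative - b      # negative documents without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus: 1 = spam, 0 = not spam.
docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "project meeting notes for review",
]
labels = [1, 1, 0, 0]

scores = chi_square_scores(docs, labels)
selected = select_top_k(scores, 5)
print(selected)
```

Terms that occur in only one class across all of its documents receive the highest scores, while terms seen in a single document score lower and are dropped. The reduced vocabulary could then be passed to a classifier such as Naive Bayes or a decision tree.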

Authors

Kamlesh Dhayal
Thapar University, Patiala, India
Sudesh Kumar
Thapar University, Patiala, India
Shalini Batra
CSED, Thapar University, Patiala, India
