News Classification: A Data Mining Approach


Affiliations
1 Department of Computer Science, Sangola College, Sangola – 413307, Maharashtra, India
2 Department of Computer Science, Shivaji University, Kolhapur – 416004, Maharashtra, India
 

Objectives: Text classification is an important application of data mining. It assigns text documents to predefined class labels on the basis of words, phrases, combinations of words, and similar features. Method/Analysis: The present study classifies news data into four predefined classes: Business, Entertainment, Sports, and Technology. WEKA, an open-source data mining tool, is used for the classification. Several classification algorithms are applied to the news data set and compared on accuracy, time, errors, and ROC to identify the best algorithm for news classification. Findings: The results were analyzed on the basis of accuracy, time, error, and the ROC curve. The study concludes that the NaïveBayes Multinomial algorithm is the best of the compared algorithms for news classification.

Keywords

Classification Algorithms, Data Mining, Text Classification, WEKA.
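
As a rough illustration of the approach summarized in the abstract, the Python sketch below trains a multinomial Naive Bayes classifier on a handful of headlines and reports accuracy, error rate, and prediction time. It is not the authors' WEKA setup: the headlines are invented toy data, the function names (train_multinomial_nb, predict, evaluate) are hypothetical, add-one smoothing is an assumption, and ROC analysis is left out for brevity.

```python
import math
import time
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels):
    """Fit multinomial Naive Bayes with add-one (Laplace) smoothing."""
    class_doc_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # class -> word -> count
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    log_prior = {c: math.log(n / len(docs)) for c, n in class_doc_counts.items()}
    log_likelihood = {}
    for c in class_doc_counts:
        total = sum(word_counts[c].values()) + len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + 1) / total) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def predict(model, text):
    """Return the class with the highest log posterior score."""
    log_prior, log_likelihood, vocab = model
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for w in text.lower().split():
            if w in vocab:  # words unseen in training are ignored
                score += log_likelihood[c][w]
        scores[c] = score
    return max(scores, key=scores.get)


def evaluate(model, docs, labels):
    """Return (accuracy, error rate, prediction time in seconds)."""
    start = time.perf_counter()
    predictions = [predict(model, d) for d in docs]
    elapsed = time.perf_counter() - start
    correct = sum(p == y for p, y in zip(predictions, labels))
    accuracy = correct / len(labels)
    return accuracy, 1.0 - accuracy, elapsed


# Hypothetical toy headlines, NOT the data set used in the study.
train_docs = [
    "stock market shares rise bank profit",
    "company profit bank merger shares",
    "film actor wins award music",
    "singer releases album film premiere",
    "team wins match football goal",
    "player scores goal cricket match",
    "new phone software update launch",
    "software company launches computer chip",
]
train_labels = [
    "Business", "Business",
    "Entertainment", "Entertainment",
    "Sports", "Sports",
    "Technology", "Technology",
]
test_docs = [
    "bank shares fall after profit warning",
    "football team scores late goal",
    "actor wins music award",
    "phone software update released",
]
test_labels = ["Business", "Sports", "Entertainment", "Technology"]

model = train_multinomial_nb(train_docs, train_labels)
accuracy, error_rate, seconds = evaluate(model, test_docs, test_labels)
print(f"accuracy={accuracy:.2f} error={error_rate:.2f} time={seconds:.6f}s")
```

Running the same evaluate function over several classifiers would give the kind of side-by-side comparison the study describes; a toy set this small says nothing about which algorithm is actually best.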
Authors

Dipak Ramchandra Kawade
Department of Computer Science, Sangola College, Sangola – 413307, Maharashtra, India
Kavita S. Oza
Department of Computer Science, Shivaji University, Kolhapur – 416004, Maharashtra, India

DOI: https://doi.org/10.17485/ijst%2F2016%2Fv9i46%2F130178