Dharmadhikari, Shweta
- Analysis of Performance of Classifier Algorithms for Different Text Representations
Authors
Affiliations
1 Program in Computer Science, DAU, Indore, IN
2 Devi Ahilya Vishwa Vidyalaya, Indore, IN
3 EKlat-Research, Pune, IN
Source
Data Mining and Knowledge Engineering, Vol 5, No 1 (2013), Pagination: 25-29
Abstract
Text representation has a strong impact on the performance of a text classification system. A representation with a large number of redundant, noisy, or irrelevant features often increases the training and classification time of the system and reduces its accuracy. An appropriate text representation with properly extracted or selected features may lead to high accuracy. Our paper provides a brief overview of popular text representation techniques, along with an analysis of the performance of three major text classifiers against three popular text representations (the vector space model, the graph-based model, and the NMF-based model) in the multi-label setting. We also propose mltcNMF, a feature extraction algorithm based on non-negative matrix factorization for high-dimensional data spaces. We conducted a set of experiments to comprehensively evaluate the effects of these text representation approaches on multi-label datasets, and we also measured the classification performance of our new algorithm. Our empirical study shows that using an appropriate feature selection strategy in text representation can significantly improve the performance of a text classification system.