
Comparative Analysis of Optimization Algorithms for Document Clustering


Affiliations
1 Department of Master of Computer Application, Dr. Mahalingam College of Engineering & Technology, Pollachi, India
2 Department of Computer Science and Engineering, Institute of Road and Transport Technology, Erode, India
     



Document clustering, or text clustering, is an unsupervised technique for grouping documents that share the same context. Document clustering algorithms are widely used in web search engines to produce results relevant to a query. The volume of information on websites is growing rapidly, which makes managing it and retrieving the required, up-to-date information a tedious task. It is also necessary to obtain the exact information the user requires from the documents. Recently, optimization algorithms have been introduced and applied to clustering algorithms. The Genetic Algorithm and the Cuckoo Search algorithm are meta-heuristic optimization algorithms used to obtain optimum solutions. In this paper, Genetic Algorithm based and Cuckoo Search based Domain-specific Keyword Similarity based Knowledgebase Creation algorithms are proposed to optimize document clustering for a question answering system. Experiments were conducted on benchmark datasets, and performance was analyzed in terms of Precision, Recall, F1, Miss Rate, Fallout, and Purity.


Keywords

Cuckoo Search, Document Clustering, Genetic Algorithm, Information Processing Knowledge Base, Text Mining.
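
The abstract names six evaluation measures without defining them. The sketch below shows how they are commonly computed, assuming the standard textbook definitions based on true/false positive and negative counts and on majority-class cluster purity; the paper itself may define them differently. The function names `retrieval_metrics` and `purity` are illustrative and do not come from the paper.

```python
from collections import Counter


def retrieval_metrics(tp, fp, fn, tn):
    """Confusion-matrix measures under the usual definitions (an assumption).

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives, and true negatives. A zero denominator yields 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": ratio(2 * precision * recall, precision + recall),
        # Miss rate is the false negative rate, i.e. 1 - recall.
        "miss_rate": ratio(fn, fn + tp),
        # Fallout is the false positive rate.
        "fallout": ratio(fp, fp + tn),
    }


def purity(cluster_ids, class_labels):
    """Fraction of documents that carry the majority label of their cluster."""
    if not class_labels or len(cluster_ids) != len(class_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    counts_by_cluster = {}
    for cluster, label in zip(cluster_ids, class_labels):
        counts_by_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(c.values()) for c in counts_by_cluster.values())
    return majority_total / len(class_labels)
```

For example, `retrieval_metrics(8, 2, 2, 8)` describes a case where 8 of 10 retrieved documents are relevant and 8 of 10 relevant documents are retrieved, and `purity([0, 0, 0, 1, 1, 1], ["a", "a", "b", "b", "b", "a"])` scores two clusters of three documents each against hypothetical reference labels.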

Authors

K. Karpagam
Department of Master of Computer Application, Dr. Mahalingam College of Engineering & Technology, Pollachi, India
A. Saradha
Department of Computer Science and Engineering, Institute of Road and Transport Technology, Erode, India

