
Document Clustering Using Firefly Algorithm


Affiliations
1 Guru Nanak Dev University Regional Campus, Jalandhar
2 Guru Nanak Dev University Regional Campus, Jalandhar, India
     



Document clustering is an important technique that is widely employed in Information Retrieval (IR). Various clustering techniques have been reported, but the effectiveness of most of them depends on an initial value for the number of clusters, k. Such an approach may not be suitable when there is no prior knowledge of the document collection. Various swarm-based clustering techniques have been proposed to address this problem; this paper explores the adaptation of the Firefly Algorithm (FA) to document clustering. We extend the work on the Gravitation Firefly Algorithm (GFA) by introducing a relocation mechanism that relocates assigned documents, if necessary. The newly proposed clustering algorithm, known as GFA_R, is then tested on a benchmark dataset obtained from 20Newsgroups. Experimental results on external and relative quality metrics for GFA_R are compared against those obtained using the standard GFA. The results show that extending GFA to GFA_R yields better-quality clustering.
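The abstract does not spell out how the relocation step decides when a document should move, so the following is only a minimal sketch of one plausible reading: after an initial assignment, each document is compared with every cluster centroid and moved only if another centroid is strictly more similar. The cosine similarity measure, the centroid representation, and the names `cosine`, `centroid`, and `relocate` are all assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def relocate(docs, labels):
    """One hypothetical relocation pass (an assumption, not the paper's rule).

    Each document moves to the cluster whose centroid is strictly more
    similar to it than the centroid of its current cluster; otherwise it
    keeps its current label. Centroids are computed once, before any move.
    """
    clusters = sorted(set(labels))
    centroids = {
        c: centroid([d for d, lab in zip(docs, labels) if lab == c])
        for c in clusters
    }
    new_labels = []
    for doc, current in zip(docs, labels):
        best = max(clusters, key=lambda c: cosine(doc, centroids[c]))
        if cosine(doc, centroids[best]) > cosine(doc, centroids[current]):
            new_labels.append(best)
        else:
            new_labels.append(current)
    return new_labels


# Toy two-term vectors; the last document starts in the wrong cluster.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 0]
relocated = relocate(docs, labels)
```

In this toy input the fourth document lies much closer to the centroid of cluster 1 than to that of cluster 0, so a single pass moves it while the other three documents stay put. A real pipeline would operate on weighted term vectors (for example TF-IDF) and might repeat the pass until no document moves.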

Keywords

Clustering Process, Data Mining, Document Clustering, Firefly Algorithm, Gravitational Firefly Algorithm.

Authors

Seerat Preet Kaur
Guru Nanak Dev University Regional Campus, Jalandhar
Neena Madan
Guru Nanak Dev University Regional Campus, Jalandhar, India
