
Enhancing the Performance of Hybrid Clustering of Documents Using Artificial Neural Network Based Approach


Affiliations
1 Kalaignarkarunanidhi Institute of Technology, India
2 VLB Janakiammal College of Engg & Tech, India
     



Clustering and classification are active areas of machine learning research that promise to help us cope with the problem of information overload on the Internet. BIRCH is a clustering algorithm designed to operate under the assumption that "the amount of memory available is limited, whereas the dataset can be arbitrary large". The algorithm generates "a compact dataset summary" while minimizing the I/O cost involved. Applying K-Means requires an initial partition to be supplied as input. To generate a "good" initial partition of the "summaries", the PDDP clustering algorithm can be used. We also compare the performance of the traditional K-Means algorithm with a new artificial neural network based clustering method. Experimental results show that the new method is more accurate than K-Means.
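The abstract notes that K-Means needs an initial partition as input. The following minimal sketch illustrates that dependency only; it is not the authors' implementation. The function name `kmeans_from_partition` and the toy data are assumptions made for illustration, and the starting labels stand in for whatever partition a method such as PDDP would supply.

```python
from typing import List, Sequence

Point = Sequence[float]


def centroid(points: List[Point]) -> List[float]:
    """Component-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_from_partition(points: List[Point], labels: Sequence[int],
                          max_iter: int = 100) -> List[int]:
    """Batch k-means started from an initial partition (one label per point).

    Hypothetical helper: the initial labels play the role of the partition
    that a seeding method would provide.
    """
    labels = list(labels)
    k = max(labels) + 1
    for _ in range(max_iter):
        # Recompute each centroid from the current members of its cluster.
        clusters = [[p for p, l in zip(points, labels) if l == c]
                    for c in range(k)]
        centroids = [centroid(m) if m else None for m in clusters]
        # Reassign every point to the nearest centroid of a non-empty cluster.
        new_labels = [
            min((c for c in range(k) if centroids[c] is not None),
                key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # the partition no longer changes
        labels = new_labels
    return labels


# Toy data: two well-separated groups, deliberately given a poor start.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
final_labels = kmeans_from_partition(data, [0, 1, 0, 1])
```

On this toy input the poor starting partition is repaired after one reassignment pass. On harder data the result can depend strongly on the starting partition, which is why the choice of seeding method matters.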


Keywords

BIRCH, PDDP, K-Means, ANN Based Clustering, Rand Index.
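The Rand Index named in the keywords is a standard measure of agreement between two labelings of the same items: the fraction of item pairs that both labelings treat the same way (together in both, or apart in both). The sketch below shows that definition only. How the paper applies it is not stated on this page, and the function name `rand_index` is an assumption.

```python
from itertools import combinations
from typing import Sequence


def rand_index(labels_a: Sequence[int], labels_b: Sequence[int]) -> float:
    """Fraction of item pairs on which two labelings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # fewer than two items: nothing to disagree on
    # A pair agrees when it is joined in both labelings or split in both.
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Label names do not matter, only which items are grouped together.
same_grouping = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed_grouping = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

A value of 1.0 means the two labelings induce identical groupings; lower values mean more pairs are treated differently.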

Authors

M. Deepa
Kalaignarkarunanidhi Institute of Technology, India
P. Tamijeselvy
VLB Janakiammal College of Engg & Tech, India
