
Improved Fuzzy C-Means Clustering of Web Usage Data with Genetic Algorithm


Affiliations
1 Department of Computer Science, Madurai Kamaraj University, Madurai, Tamil Nadu, India
2 Department of Microprocessor, Madurai Kamaraj University, Madurai, Tamil Nadu, India
     



Clustering is one of the important functions in web usage mining. Web usage mining applies data mining techniques to discover usage patterns from web data. Cluster analysis aims to identify groups of similar objects and therefore helps to discover the distribution of patterns and interesting correlations in large data sets. These methods are not only major tools for uncovering the underlying structure of a given data set, but also promising tools for uncovering local input-output relations of a complex system. Fuzzy C-means (FCM) is one of the most widely used fuzzy clustering algorithms in real-world applications. However, the method has two major limitations. The first is that the number of clusters must be specified in advance. The second is that FCM can get stuck in sub-optimal solutions. In this paper, we propose a new framework that uses a Genetic Algorithm (GA) to improve the quality of the web session clusters produced by fuzzy c-means clustering. In the first step, the fuzzy c-means algorithm is used to cluster the user sessions. In the second step, a GA-based refinement algorithm improves the cluster quality. The proposed algorithm is tested on web access logs collected from the Internet Traffic Archive (ITA), and the results show that refined initial starting points and post-processing refinement of clusters indeed lead to improved solutions.
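
The two-step idea described in the abstract (FCM first, then a GA-based refinement) can be illustrated with a minimal Python sketch. The abstract does not specify the chromosome encoding, fitness function, or genetic operators, so everything below is an assumption made for illustration: sessions are taken to be numeric feature vectors, a chromosome is a full set of cluster centers, fitness is the standard FCM objective, and the function names (`fcm`, `fcm_objective`, `ga_refine`) and parameters (`sigma`, `pop_size`, `generations`) are hypothetical. It is not the authors' algorithm.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Standard FCM membership update: u[i, k] for point i and cluster k."""
    # Distances from every point to every center, shape (n_points, n_clusters).
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)  # guard against a point lying exactly on a center
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1

def fcm_objective(X, centers, m=2.0):
    """FCM objective: sum over i, k of u_ik^m * ||x_i - v_k||^2."""
    u = fcm_memberships(X, centers, m)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float((u ** m * d2).sum())

def fcm(X, c, m=2.0, iters=100, tol=1e-6, rng=None):
    """Step 1: plain fuzzy c-means with c given in advance."""
    rng = np.random.default_rng(rng)
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(iters):
        w = fcm_memberships(X, centers, m) ** m
        new_centers = (w.T @ X) / w.sum(axis=0)[:, None]  # weighted means
        converged = np.linalg.norm(new_centers - centers) < tol
        centers = new_centers
        if converged:
            break
    return centers

def ga_refine(X, centers, m=2.0, pop_size=20, generations=50,
              sigma=0.1, rng=None):
    """Step 2 (assumed design): evolve center sets to lower the FCM objective."""
    rng = np.random.default_rng(rng)
    # Initial population: the FCM result plus Gaussian-perturbed copies.
    pop = [centers.copy()] + [
        centers + rng.normal(0.0, sigma, centers.shape)
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        pop.sort(key=lambda ch: fcm_objective(X, ch, m))
        parents = pop[: pop_size // 2]  # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            i, j = rng.choice(len(parents), size=2, replace=False)
            # Uniform crossover: each center comes whole from one parent.
            mask = rng.random(centers.shape[0]) < 0.5
            child = np.where(mask[:, None], parents[i], parents[j])
            children.append(child + rng.normal(0.0, sigma, child.shape))
        pop = parents + children
    return min(pop, key=lambda ch: fcm_objective(X, ch, m))

# Toy data standing in for session vectors (synthetic, not ITA logs).
data_rng = np.random.default_rng(0)
X = np.vstack([data_rng.normal(0.0, 0.3, (30, 2)),
               data_rng.normal(3.0, 0.3, (30, 2))])
fcm_centers = fcm(X, c=2, rng=1)
refined_centers = ga_refine(X, fcm_centers, rng=2)
```

Because the FCM result is seeded into the initial population and the better half of each generation is carried over unchanged, the refined centers can never score worse on the FCM objective than the centers they started from. Whether this yields a meaningful improvement on real session data depends on the encoding and fitness actually used in the paper, which this sketch does not attempt to reproduce.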

Keywords

Web Usage Mining, Clustering, Fuzzy C-Means, Genetic Algorithm.






Authors

N. Sujatha
Department of Computer Science, Madurai Kamaraj University, Madurai, Tamil Nadu, India
K. Iyakutti
Department of Microprocessor, Madurai Kamaraj University, Madurai, Tamil Nadu, India
