On the Consequence of Variation Measure in K-Modes Clustering Algorithm

Abedalhakeem T. Issa


Affiliations
1 Computer Science Department, Shaqra University, Dawadmi Community College, Dawadmi 11911 P.O. Box 18, Saudi Arabia

Abstract

Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. Clustering is one of the most important data mining techniques; it partitions data according to some similarity criterion. The problem of clustering categorical data has recently attracted much attention from the data mining research community. The original k-means algorithm, also known as Lloyd's algorithm, is designed to work primarily on numeric data sets, which prevents it from being applied to categorical data clustering, an integral part of data mining. This paper describes an extension of the k-modes algorithm for clustering categorical data. By modifying the simple matching variation measure for categorical objects, a heuristic approach was developed that allows the k-modes paradigm to obtain clusters with strong intra-cluster similarity and to cluster large categorical data sets efficiently. The main aim of this paper is to rigorously derive the updating formula of the k-modes clustering algorithm with the new variation measure, and to establish the convergence of the algorithm under the optimization framework.
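
For orientation, the baseline k-modes iteration that the abstract refers to can be sketched as below. This is a minimal illustration that assumes the standard simple matching measure (the number of attributes on which two records differ); it does not implement the modified variation measure proposed in the paper, whose formula is not given in the abstract. The function names `matching_dissimilarity`, `column_modes`, and `k_modes` are hypothetical and chosen only for this sketch.

```python
from collections import Counter


def matching_dissimilarity(x, y):
    """Simple matching measure: number of attributes on which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def column_modes(records):
    """Most frequent category of each attribute across the given records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(data, initial_modes, max_iter=100):
    """Baseline k-modes: alternate assignment and mode update until labels stop changing.

    Assumption: the caller supplies the initial modes; the initialization
    strategy is outside the scope of this sketch.
    """
    modes = [tuple(m) for m in initial_modes]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record joins the cluster with the nearest mode.
        new_labels = [
            min(range(len(modes)), key=lambda j: matching_dissimilarity(x, modes[j]))
            for x in data
        ]
        if new_labels == labels:
            break  # no record changed cluster, so the iteration has converged
        labels = new_labels
        # Update step: recompute each mode attribute by attribute.
        for j in range(len(modes)):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mode
                modes[j] = column_modes(members)
    return labels, modes
```

Calling `k_modes(data, initial_modes)` with a list of equal-length tuples of category values returns one cluster label per record together with the final modes. Replacing `matching_dissimilarity` with another measure generally changes the appropriate mode update as well, which is why the paper derives the updating formula for its new measure rather than reusing the frequency-based mode.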