On the Consequence of Variation Measure in K-Modes Clustering Algorithm
Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. Clustering is one of the most important data mining techniques; it partitions data according to some similarity criterion. The problem of clustering categorical data has recently attracted much attention from the data mining research community. The original k-means algorithm, also known as Lloyd's algorithm, is designed to work primarily on numeric data sets. This prevents the algorithm from being applied to categorical data clustering, which is an integral part of data mining. This paper describes an extension to the k-modes algorithm for clustering categorical data. By modifying a simple matching variation measure for categorical entities, a heuristic approach was previously developed that allows the k-modes paradigm to obtain clusters with strong intra-similarity and to cluster large categorical data sets efficiently. The main aim of this paper is to derive rigorously the updating formula of the k-modes clustering algorithm with the new variation measure, and to establish the convergence of the algorithm under the optimization framework.
Keywords
Data Mining, Clustering, K-Means Algorithm, Definite Data.
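For orientation, the k-modes paradigm mentioned in the abstract can be sketched as follows. This is only a minimal illustration under stated assumptions: it uses the plain simple matching measure (the count of attributes on which two records differ), not the modified variation measure studied in the paper, whose definition is not given in the abstract. The function names `simple_matching`, `column_modes`, and `k_modes`, as well as the toy data, are hypothetical and chosen for this example.

```python
from collections import Counter


def simple_matching(a, b):
    """Count the attributes on which two categorical records differ.

    Assumption: plain simple matching, not the paper's modified measure.
    """
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Return the most frequent category of each attribute as a tuple."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(data, init_modes, max_iter=100):
    """Alternate assignment and mode-update steps until labels stop changing."""
    modes = [tuple(m) for m in init_modes]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each record to its nearest mode.
        new_labels = [
            min(range(len(modes)), key=lambda j: simple_matching(row, modes[j]))
            for row in data
        ]
        if new_labels == labels:
            break  # No record changed cluster, so the procedure has converged.
        labels = new_labels
        # Update step: recompute each mode attribute by attribute.
        for j in range(len(modes)):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:  # Keep the old mode if a cluster becomes empty.
                modes[j] = column_modes(members)
    return labels, modes


if __name__ == "__main__":
    # Hypothetical toy data: (colour, size, shape).
    data = [
        ("red", "small", "round"),
        ("red", "small", "oval"),
        ("red", "large", "round"),
        ("blue", "large", "square"),
        ("blue", "large", "oval"),
        ("blue", "small", "square"),
    ]
    labels, modes = k_modes(data, init_modes=[data[0], data[3]])
    print(labels)
    print(modes)
```

The two alternating steps mirror k-means, with the mean replaced by the attribute-wise mode and the Euclidean distance replaced by a mismatch count. Each step cannot increase the total within-cluster mismatch, which is the kind of property a convergence analysis under an optimization framework would make precise.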