CAK-NN Algorithm: Cluster and Attribute Weightage-Based Algorithm for Effective Classification
The task of classification is to assign a new object to one of a given set of classes based on the object's attribute values. The k-Nearest Neighbor (k-NN) method is one of the simplest classification methods used in data mining and machine learning. Although k-NN can be applied broadly, it has a few inherent problems, which is why researchers have proposed various extensions of k-NN, and even ensemble formulations of k-NN classifiers. In our proposed CAk-NN algorithm (cluster and attribute weighted k-NN), a weight is assigned to every attribute of the training dataset so that distances can be matched more accurately. In addition, the training dataset is clustered to reduce the execution time of classification, and the resulting clusters are used to classify test instances. For this purpose, we propose an attribute weighted k-means clustering algorithm that partitions the training dataset. The centroids of the resulting clusters then constitute a sub-sample of the input database, which is used for classification. For a test instance, an attribute weighted distance is computed between the instance and the mean of each cluster of the training dataset. Based on this distance, the k nearest clusters are identified, and the class label is assigned directly if all of these clusters belong to the same class. Otherwise, the data records belonging to the k nearest clusters are retrieved, and the k nearest neighbor records among them are identified to determine the class. Finally, the performance of the proposed CAk-NN algorithm is compared with that of the k-NN algorithm in terms of computation time and classification accuracy on the Iris dataset.
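The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how attribute weights are derived, how a cluster receives its class, or how the final vote is taken. The sketch therefore assumes that the weights `w` are supplied by the caller, that a cluster's class is the majority class of its members, and that the fallback step uses a majority vote over the k nearest retrieved records. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def weighted_dist(a, b, w):
    """Attribute-weighted Euclidean distance (weights w are assumed given)."""
    return math.sqrt(sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w)))


def nearest_centroid(x, centroids, w):
    """Index of the centroid closest to x under the weighted distance."""
    return min(range(len(centroids)),
               key=lambda c: weighted_dist(x, centroids[c], w))


def weighted_kmeans(X, w, n_clusters, n_iter=20, seed=0):
    """k-means that uses the weighted distance in the assignment step."""
    rng = random.Random(seed)
    centroids = [list(x) for x in rng.sample(X, n_clusters)]
    for _ in range(n_iter):
        assign = [nearest_centroid(x, centroids, w) for x in X]
        for c in range(n_clusters):
            members = [x for x, a in zip(X, assign) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    assign = [nearest_centroid(x, centroids, w) for x in X]
    return centroids, assign


def majority(labels):
    """Most frequent label (assumed voting rule)."""
    return Counter(labels).most_common(1)[0][0]


def cak_nn_predict(x, X, y, w, centroids, assign, k):
    """Classify x via the k nearest clusters, falling back to record-level k-NN."""
    non_empty = [c for c in range(len(centroids)) if c in set(assign)]
    non_empty.sort(key=lambda c: weighted_dist(x, centroids[c], w))
    nearest = non_empty[:k]
    # Assumption: a cluster's class is the majority class of its members.
    cluster_labels = {
        majority([yi for yi, a in zip(y, assign) if a == c]) for c in nearest
    }
    if len(cluster_labels) == 1:
        return cluster_labels.pop()
    # Clusters disagree: retrieve their records and run weighted k-NN on them.
    idx = [i for i, a in enumerate(assign) if a in nearest]
    idx.sort(key=lambda i: weighted_dist(x, X[i], w))
    return majority([y[i] for i in idx[:k]])
```

In this sketch the time saving comes from comparing a test instance against a handful of centroids first, and touching individual training records only when the nearest clusters disagree on the class.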