
CAK-NN Algorithm: Cluster and Attribute Weightage-Based Algorithm for Effective Classification





The task of classification is to assign a new object to one of a given set of classes based on the object's attribute values. k-Nearest Neighbor (k-NN) is one of the simplest classification methods used in data mining and machine learning. Although k-NN is broadly applicable, it has a few inherent problems, which is why researchers have proposed various extensions of k-NN, and even ensemble formulations of k-NN classifiers. In our proposed CAk-NN (cluster and attribute weighted k-NN) algorithm, a weight is assigned to every attribute of the training dataset so that distances can be matched more accurately. In addition, the training dataset is clustered to reduce the execution time of classification, and the resulting clusters are used to classify test instances. For this purpose, we propose an attribute weighted k-means clustering algorithm that partitions the training dataset. The centroids of the resulting clusters then constitute a sub-sample of the input database, which is used for classification. In the testing phase, an attribute-weight-based distance measure is computed between a test instance and the mean of each cluster of the training dataset. According to the computed distances, the k nearest clusters are identified, and the class label is assigned if all of these clusters belong to the same class. Otherwise, the relevant data records are retrieved from the k nearest clusters, and the k nearest neighbor data records among them are identified. Finally, the performance of the proposed CAk-NN algorithm is compared with that of the k-NN algorithm in terms of computation time and classification accuracy on the Iris dataset.
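The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how attribute weights are derived, how a cluster's class is defined, or how the final label is chosen from the retrieved records. The sketch therefore assumes that the weights `w` are supplied by the caller and scale the squared attribute differences, that a cluster's class is the majority label of its members, and that the fallback step uses a majority vote. All function names (`weighted_dist`, `weighted_kmeans`, `cak_nn_predict`) are hypothetical.

```python
import math
import random
from collections import Counter


def weighted_dist(a, b, w):
    """Attribute-weighted Euclidean distance between two records."""
    return math.sqrt(sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w)))


def nearest_index(x, points, w):
    """Index of the point closest to x under the weighted distance."""
    return min(range(len(points)), key=lambda j: weighted_dist(x, points[j], w))


def weighted_kmeans(X, w, n_clusters, n_iter=20, seed=0):
    """Lloyd-style k-means that uses the weighted distance for assignment.

    ASSUMPTION: weights are fixed inputs; the abstract does not specify
    how they are computed.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(X, n_clusters)]
    for _ in range(n_iter):
        assign = [nearest_index(x, centroids, w) for x in X]
        for j in range(n_clusters):
            members = [x for x, a in zip(X, assign) if a == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    # Recompute assignments so they match the final centroids.
    assign = [nearest_index(x, centroids, w) for x in X]
    return centroids, assign


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def cak_nn_predict(x, X, y, w, centroids, assign, k=3):
    """Classify x via its k nearest clusters, falling back to k-NN on records."""
    # Rank non-empty clusters by weighted distance from x to their centroids.
    ranked = sorted(set(assign), key=lambda j: weighted_dist(x, centroids[j], w))
    nearest = ranked[:k]
    # ASSUMPTION: a cluster's class is the majority label of its members.
    cluster_labels = [
        majority([yi for yi, a in zip(y, assign) if a == j]) for j in nearest
    ]
    if len(set(cluster_labels)) == 1:
        return cluster_labels[0]
    # Clusters disagree: run k-NN only on records from those clusters.
    candidates = [
        (weighted_dist(x, xi, w), yi)
        for xi, yi, a in zip(X, y, assign)
        if a in nearest
    ]
    candidates.sort(key=lambda t: t[0])
    # ASSUMPTION: the final label is a majority vote over the k nearest records.
    return majority([yi for _, yi in candidates[:k]])
```

In this sketch, `X` is a list of numeric records, `y` the matching list of labels, and `w` one non-negative weight per attribute. Clustering runs once on the training data; each call to `cak_nn_predict` then compares the test instance against the centroids first and touches individual records only when the nearest clusters disagree, which is where the reduction in classification time described above would come from.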


Keywords

Classification, Clustering, K-Nearest Neighbor Algorithm, K-Means Clustering Algorithm, Distance Measure, CAK-NN (Cluster and Attribute Weighted K-NN Algorithm).

Authors

Parvinder S. Sandhu
Department of Computer Science & Engineering at Rayat & Bahra Institute of Engineering & Bio-Technology, Mohali, India
Dalvinder S. Dhaliwal
Department of Computer Science & Engineering, RIMIT Institute of Engg. & Technology, Punjab, India
S. N. Panda
Regional Institute of Management & Technology, Mandi Gobindgarh, India
