
A Data Clustering Using Visual Assessment of Cluster Tendency Algorithm of Data Partitioning Methods


Affiliations
1 Department of IT, Vivekananda Institute of Engineering and Technology for Women, Elayampalayam, Namakkal-637205, Tamilnadu, India
2 Department of IT, Bannari Amman Institute of Technology, Sathyamangalam, Erode-637205, Tamilnadu, India
     



This paper proposes a new algorithm, called Visual Assessment of cluster Tendency (VAT), that uses a visual approach to find the number of clusters in data. The VAT method readily displays cluster tendency for small data sets as grayscale images, but it is too computationally costly for larger data sets. We first review visual methods, which have been widely studied and used in data cluster analysis. The basis of the method is to regard D as a subset of known values that is part of a larger, unknown N×N dissimilarity matrix, and then to impute the missing values from D. The VAT algorithm generally represents D as an N×N image I(D̄), in which the objects are reordered to reveal hidden cluster structure along the diagonal of the image. This paper addresses the limitation by proposing a VAT algorithm in which D is mapped to a matrix D̄ in a graph embedding space, which is then reordered using the VAT algorithm. Two points are important: (i) because sVAT scales VAT to data sets of arbitrary size, and because coVAT depends explicitly on VAT, the new approach scales immediately through, say, the sVAT model, which works without alteration even for very large (unloadable) data sets; and (ii) VAT, sVAT, and coVAT are autonomous, parameter-free models: no “hidden values” are needed to make them work. A sampling-based extended scheme is also proposed to enable visual cluster analysis for large data sets. Extensive experimental results on several synthetic and real-world data sets validate our VAT algorithms.
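The reordering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly described VAT ordering (start from an object involved in the largest dissimilarity, then repeatedly append the unselected object closest to the already selected set), which may differ from the variant evaluated in this paper. The function names `vat_order` and `vat_reorder` and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def vat_order(D):
    """Return a VAT-style ordering for a square, symmetric dissimilarity matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Start from one endpoint of the largest dissimilarity in D.
    first = int(np.unravel_index(np.argmax(D), D.shape)[0])
    order = [first]
    remaining = np.ones(n, dtype=bool)
    remaining[first] = False
    # nearest[q] is the smallest dissimilarity between q and any selected object.
    nearest = D[first].copy()
    for _ in range(n - 1):
        # Pick the unselected object closest to the selected set.
        candidates = np.where(remaining, nearest, np.inf)
        nxt = int(np.argmin(candidates))
        order.append(nxt)
        remaining[nxt] = False
        nearest = np.minimum(nearest, D[nxt])
    return np.array(order)

def vat_reorder(D):
    """Return the reordered matrix and the permutation that produced it."""
    D = np.asarray(D, dtype=float)
    order = vat_order(D)
    return D[np.ix_(order, order)], order

# Toy data (an assumption for illustration): two interleaved 1-D groups.
x = np.array([0, 100, 1, 101, 2, 102], dtype=float)
D = np.abs(x[:, None] - x[None, :])
D_vat, order = vat_reorder(D)
```

On this toy input the permutation places the members of each group next to each other, so `D_vat` shows two low-dissimilarity blocks along its diagonal. Displaying `D_vat` as a grayscale image (for example with an image-plotting routine) would give the kind of picture the abstract refers to.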

Keywords

Clustering, Cluster Analysis, Cluster Tendency, Hidden Values, VAT, sVAT, coVAT.

Authors

R. Tamilselvan
Department of IT, Vivekananda Institute of Engineering and Technology for Women, Elayampalayam, Namakkal-637205, Tamilnadu, India
V. Hariharaprabu
Department of IT, Vivekananda Institute of Engineering and Technology for Women, Elayampalayam, Namakkal-637205, Tamilnadu, India
R. Bhaskaran
Department of IT, Vivekananda Institute of Engineering and Technology for Women, Elayampalayam, Namakkal-637205, Tamilnadu, India
C. Palanisamy
Department of IT, Bannari Amman Institute of Technology, Sathyamangalam, Erode-637205, Tamilnadu, India
