
A Novel Clustering Data Based on K-Means


Affiliations
1 Department of CSE, SBCE, Khammam, India
2 Department of CSE, Mother Teresa Institute of Science and Technology, Sattupally, India
3 Department of CSE, KITS, Khammam, India
     



This paper proposes a new algorithm for clustering symbolic data based on the K-Means algorithm. The algorithm allows both the data entries and the membership degrees to be intervals. Our approach performs dynamic document clustering based on a structured MARDL technique. Each document is first weighted using term frequency and inverse document frequency (TF-IDF), with cosine similarity as the similarity measure, and the documents are then separated into clusters using the K-Means method. The largest cluster is split into two sub-clusters, and this step is repeated until the resulting clusters have high similarity. In addition, our approach tends to capture the intrinsic structure of a data set, e.g., the number of clusters. Simulation results demonstrate that our approach yields favorable results for a variety of temporal data clustering tasks. Because our weighted cluster ensemble algorithm can combine any input partitions to generate a clustering ensemble, we also investigate its limitations through formal analysis and empirical studies.
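
The weighting and splitting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the MARDL technique, the initial number of clusters, or the stopping rule, so the sketch assumes a single starting cluster, a 2-means split, and a hypothetical `min_cohesion` threshold (mean cosine similarity of members to their centroid) as the meaning of "high similarity". All function names are illustrative.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse, L2-normalised TF-IDF vector (dict) per document."""
    tokenised = [doc.lower().split() for doc in docs]
    n = len(tokenised)
    df = Counter(term for toks in tokenised for term in set(toks))
    vectors = []
    for toks in tokenised:
        tf = Counter(toks)
        # Smoothed IDF (an assumption): common terms keep a small positive weight.
        vec = {t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1)
               for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors, members):
    """Mean vector of the documents whose indices are in `members`."""
    total = Counter()
    for i in members:
        for t, w in vectors[i].items():
            total[t] += w
    return {t: w / len(members) for t, w in total.items()}

def cohesion(vectors, members):
    """Mean cosine similarity between cluster members and their centroid."""
    c = centroid(vectors, members)
    return sum(cosine(vectors[i], c) for i in members) / len(members)

def two_means(vectors, members, iterations=10):
    """Split one cluster in two with K-Means (k=2) and deterministic seeding."""
    seed_a = members[0]
    # Second seed: the member least similar to the first seed.
    seed_b = min(members, key=lambda i: cosine(vectors[seed_a], vectors[i]))
    centres = [vectors[seed_a], vectors[seed_b]]
    groups = [[], []]
    for _ in range(iterations):
        new_groups = [[], []]
        for i in members:
            sims = [cosine(vectors[i], c) for c in centres]
            new_groups[0 if sims[0] >= sims[1] else 1].append(i)
        if not all(new_groups):
            break  # degenerate split: keep the last valid grouping
        groups = new_groups
        centres = [centroid(vectors, g) for g in groups]
    return groups

def bisecting_clusters(docs, min_cohesion=0.8):
    """Repeatedly split the largest insufficiently cohesive cluster."""
    vectors = tfidf_vectors(docs)
    clusters = [list(range(len(docs)))]
    while True:
        loose = [c for c in clusters
                 if len(c) > 1 and cohesion(vectors, c) < min_cohesion]
        if not loose:
            return clusters
        target = max(loose, key=len)
        left, right = two_means(vectors, target)
        if not left or not right:
            return clusters  # the cluster could not be split further
        clusters.remove(target)
        clusters += [left, right]

docs = [
    "kmeans clustering groups documents",
    "clustering documents with kmeans",
    "stock market prices fall",
    "market prices and stock trading",
]
print(bisecting_clusters(docs))
```

Each returned cluster is a list of document indices. Raising `min_cohesion` produces more, tighter clusters, so the number of clusters emerges from the threshold rather than being fixed in advance.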

Keywords

Clustering, K-Means, MARDAL.





Authors

Swapna Sunkara
Department of CSE, SBCE, Khammam, India
K. Nageswara Rao
Department of CSE, Mother Teresa Institute of Science and Technology, Sattupally, India
Upendar Para
Department of CSE, KITS, Khammam, India
Shaik Nagasaidulu
Department of CSE, KITS, Khammam, India
