Analysis of Heuristic Measures for Cluster Split in Bisecting K-Means

Y. Sri Lalitha; A. Govardhan

Affiliations
1 Department of CSE, Gokaraju Rangaraju Institute of Engineering and Technology, India
2 Department of CSE, Jawaharlal Nehru Technological University, Hyderabad, India

With the ever-increasing number of documents on the web and in other repositories, organizing and categorizing these documents manually to meet the diverse needs of users is a complicated job; hence the machine learning technique known as clustering is very useful. The work proposed in this paper is based on shared neighbors: two documents are said to be neighbors of each other when their similarity is greater than a threshold. We choose to work with bisecting k-means, in which cluster quality depends on which cluster is chosen to be split at each step until k clusters are formed. Automatically selecting the cluster to split is difficult and time-consuming for text documents because of their high dimensionality. This paper implements bisecting k-means, a text document clustering technique, to analyze the best criteria for selecting the cluster to be split. We compared our results with those proposed in the literature and observed that our experimental results were promising when tested on real-life data sets.
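The two ideas named in the abstract, threshold-based neighbors and a bisecting k-means loop driven by a split criterion, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: documents are small dense vectors, similarity is cosine, and the function names (`neighbor_matrix`, `two_means`, `largest_size`, `least_cohesive`, `bisecting_kmeans`) are hypothetical. The two split criteria shown are generic stand-ins, not the heuristic measures evaluated in the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity of two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def neighbor_matrix(docs, theta):
    """M[i][j] = 1 when documents i and j have similarity greater than theta."""
    n = len(docs)
    return [[1 if cosine(docs[i], docs[j]) > theta else 0 for j in range(n)]
            for i in range(n)]

def centroid(docs, members):
    """Component-wise mean of the member documents."""
    dim = len(docs[0])
    return [sum(docs[i][d] for i in members) / len(members) for d in range(dim)]

def two_means(docs, members, iters=10):
    """Bisect one cluster with cosine-based 2-means.

    Seeding is deterministic (an assumption of this sketch): the first
    member and the member least similar to it.
    """
    first = members[0]
    second = min(members, key=lambda i: cosine(docs[first], docs[i]))
    centers = [docs[first], docs[second]]
    left, right = [], []
    for _ in range(iters):
        left, right = [], []
        for i in members:
            s0 = cosine(docs[i], centers[0])
            s1 = cosine(docs[i], centers[1])
            (left if s0 >= s1 else right).append(i)
        if not left or not right:
            break
        centers = [centroid(docs, left), centroid(docs, right)]
    if not left or not right:
        # Degenerate case (e.g. identical documents): force a split.
        return members[:1], members[1:]
    return left, right

# Hypothetical split criteria: the cluster with the highest score is split.
def largest_size(docs, members):
    return len(members)

def least_cohesive(docs, members):
    c = centroid(docs, members)
    return -sum(cosine(docs[i], c) for i in members) / len(members)

def bisecting_kmeans(docs, k, score=largest_size):
    """Start with one cluster and bisect the selected cluster until k exist."""
    clusters = [list(range(len(docs)))]
    while len(clusters) < k:
        candidates = [c for c in clusters if len(c) > 1]
        if not candidates:
            break  # nothing left that can be split
        target = max(candidates, key=lambda c: score(docs, c))
        clusters.remove(target)
        clusters.extend(two_means(docs, target))
    return clusters

docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
neighbors = neighbor_matrix(docs, theta=0.5)
clusters = bisecting_kmeans(docs, k=2)
```

Passing `score=least_cohesive` instead of the default swaps the selection rule without touching the bisecting loop, which is the kind of comparison between split criteria that the abstract describes.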

Keywords

Text Clustering, Similarity Measures, Coherent Clustering, Splitting Criteria.

Data Mining and Knowledge Engineering
