Analysis of Heuristic Measures for Cluster Split in Bisecting K-Means
With the ever-increasing number of documents on the web and in other repositories, manually organizing and categorizing these documents to meet the diverse needs of users is a complicated job; hence the machine learning technique known as clustering is very useful. The proposed work is based on shared neighbors: two documents are said to be neighbors of each other when their similarity is greater than a threshold. We choose to work with bisecting k-means, in which cluster quality depends on which cluster is chosen to be split at each step until k clusters are formed. Automatic selection of the cluster to be split is difficult and time-consuming for text documents because of their high dimensionality. This paper implements bisecting k-means, a text document clustering technique, to analyze the best criteria for selecting the cluster to be split. We compared our results with those reported in the literature and observed that our approach gave promising results when tested on real-life data sets.
Keywords
Text Clustering, Similarity Measures, Coherent Clustering, Splitting Criteria.
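
The two ideas the abstract relies on, neighbors defined by a similarity threshold and bisecting k-means with a pluggable rule for choosing the cluster to split, can be illustrated with a minimal Python sketch. It is not the paper's implementation. Cosine similarity, the deterministic seeding, the toy vectors, and the two example criteria (`pick_largest`, `pick_least_cohesive`) are assumptions made for illustration; the abstract does not say which similarity measure or which splitting criteria were evaluated.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length term vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def neighbors(docs, threshold):
    """For each document, the set of other documents whose similarity exceeds the threshold."""
    n = len(docs)
    return [{j for j in range(n) if j != i and cosine(docs[i], docs[j]) > threshold}
            for i in range(n)]

def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cohesion(docs, cluster):
    """Average similarity of a cluster's members to its centroid."""
    c = centroid([docs[i] for i in cluster])
    return sum(cosine(docs[i], c) for i in cluster) / len(cluster)

def bisect(docs, cluster, iters=10):
    """Split one cluster (a list of indices) in two with a basic 2-means loop."""
    # Hypothetical deterministic seeding: the first member and the member least similar to it.
    a = cluster[0]
    b = min(cluster, key=lambda i: cosine(docs[a], docs[i]))
    ca, cb = docs[a], docs[b]
    left, right = list(cluster), []
    for _ in range(iters):
        left = [i for i in cluster if cosine(docs[i], ca) >= cosine(docs[i], cb)]
        right = [i for i in cluster if i not in left]
        if not left or not right:
            break
        ca = centroid([docs[i] for i in left])
        cb = centroid([docs[i] for i in right])
    return left, right

# Two illustrative splitting criteria; each receives the splittable clusters and returns one.
def pick_largest(docs, candidates):
    return max(candidates, key=len)

def pick_least_cohesive(docs, candidates):
    return min(candidates, key=lambda c: cohesion(docs, c))

def bisecting_kmeans(docs, k, criterion):
    """Repeatedly split the cluster chosen by `criterion` until k clusters exist."""
    clusters = [list(range(len(docs)))]
    while len(clusters) < k:
        candidates = [c for c in clusters if len(c) > 1]
        if not candidates:
            break  # nothing left that can be split
        target = criterion(docs, candidates)
        left, right = bisect(docs, target)
        if not left or not right:
            break  # degenerate split, e.g. identical documents
        clusters.remove(target)
        clusters += [left, right]
    return clusters

# Toy term-frequency vectors (made up): three loose topics, two documents each.
docs = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0, 0.9, 0.1], [0, 0, 1], [0.1, 0, 0.9]]
by_size = sorted(bisecting_kmeans(docs, 3, pick_largest))
by_cohesion = sorted(bisecting_kmeans(docs, 3, pick_least_cohesive))
```

Passing a different `criterion` function is the only change needed to compare splitting heuristics on the same data, which is the kind of comparison the abstract describes.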