The Challenges of Cluster Analysis and Related Work

K-means is one of the most commonly used clustering algorithms, but it does not perform well on data with outliers or with clusters of different sizes or non-globular shapes. The single-link agglomerative clustering method is the most suitable for capturing clusters with non-globular shapes, but this approach is very sensitive to noise and cannot handle clusters of varying density. However, most of the clustering challenges, particularly those related to “quality” rather than computational resources, are the same challenges that existed decades ago: how to find clusters with differing sizes, shapes and densities, how to handle noise and outliers, and how to determine the number of clusters. The general idea of our novel subspace outlier model is to analyze, for each point, how well it fits the subspace that is spanned by a set of reference points. The experimental evaluation showed that the proposed method can find more interesting and more meaningful outliers in high-dimensional data, with higher accuracy than full-dimensional outlier models and at no additional computational cost.
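
The idea of scoring each point by how well it fits the subspace of its reference points can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, purely for illustration, that the reference set is the k nearest neighbours in the full space, that the relevant subspace consists of the axes on which the reference set has low variance, and that the score is the point's distance from the reference mean along those axes divided by the number of such axes. The function name subspace_outlier_scores and the parameters k and alpha are hypothetical.

```python
import numpy as np


def subspace_outlier_scores(X, k=8, alpha=0.8):
    """Score each row of X by its deviation from the subspace of its reference set.

    Assumptions (not taken from the source): the reference set is the k
    nearest neighbours in the full space, and an axis is "relevant" when the
    reference set's variance on it is below alpha times the mean per-axis
    variance.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise squared Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist2, np.inf)  # a point is never its own reference

    scores = np.zeros(n)
    for i in range(n):
        ref = X[np.argsort(dist2[i])[:k]]      # reference points of point i
        mean = ref.mean(axis=0)
        var = ref.var(axis=0)
        relevant = var < alpha * var.mean()    # low-variance axes
        if not relevant.any():
            continue                           # no relevant axis: score stays 0
        dev = X[i, relevant] - mean[relevant]
        scores[i] = np.sqrt(np.sum(dev ** 2)) / relevant.sum()
    return scores


if __name__ == "__main__":
    # A flat 7x7 grid in the plane z = 0, plus one point lifted off the plane.
    gx, gy = np.meshgrid(np.arange(7.0), np.arange(7.0))
    grid = np.column_stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)])
    data = np.vstack([grid, [3.0, 3.0, 5.0]])
    scores = subspace_outlier_scores(data)
    print("highest-scoring index:", int(np.argmax(scores)))
```

In this toy data the lifted point's neighbours all lie in the plane, so the z axis has zero variance among them and the point's offset along z dominates its score. A full-dimensional distance would mix that offset with the spread along x and y.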

Keywords

Clustering, High-Dimensional, Nearest Neighbours, Data Points, Root Mapping.
