Inuwa-Dutse, Isa
- A Novel Algorithm for Clustering High Dimensional Data
Authors
1 Institute of Information Science, Beijing Jiaotong University, Beijing 100044, CN
Source
Artificial Intelligent Systems and Machine Learning, Vol 11, No 8 (2019), Pagination: 141-144
Abstract
The Challenges of Cluster Analysis and Related Work
K-means is one of the most commonly used clustering algorithms, but it performs poorly on data with outliers or with clusters of different sizes or non-globular shapes. Single-link agglomerative clustering is the method best suited to capturing clusters with non-globular shapes, but it is very sensitive to noise and cannot handle clusters of varying density. Most clustering challenges, particularly those related to “quality” rather than computational resources, are the same ones that existed decades ago: how to find clusters of differing sizes, shapes, and densities; how to handle noise and outliers; and how to determine the number of clusters. The general idea of our novel subspace outlier model is to analyze, for each point, how well it fits the subspace spanned by a set of reference points. The experimental evaluation showed that the proposed method finds more interesting and more meaningful outliers in high-dimensional data, with higher accuracy than full-dimensional outlier models and at no additional computational cost.
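The abstract does not spell out how the reference points are chosen or how "fit" to their subspace is measured, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes (hypothetically) that the reference set of a point is its k nearest neighbours in the full space, that a dimension belongs to the subspace when the reference set's variance in it falls below `alpha` times the mean variance, and that the score is the distance to the reference mean in those dimensions, divided by their count. The function name and both parameters are illustrative.

```python
import math


def subspace_outlier_scores(points, k=10, alpha=0.8):
    """Score each point by how poorly it fits its reference set's subspace.

    Assumptions (not taken from the source): the reference set is the
    k nearest neighbours in the full space, and a dimension is "relevant"
    when the reference set's variance in it is below alpha times the
    mean variance over all dimensions.
    """
    dims = len(points[0])
    scores = []
    for i, p in enumerate(points):
        # Reference set: the k nearest neighbours of p, excluding p itself.
        others = [q for j, q in enumerate(points) if j != i]
        others.sort(key=lambda q: math.dist(p, q))
        ref = others[:k]

        # Per-dimension mean and variance of the reference set.
        means = [sum(q[d] for q in ref) / len(ref) for d in range(dims)]
        variances = [
            sum((q[d] - means[d]) ** 2 for q in ref) / len(ref)
            for d in range(dims)
        ]
        threshold = alpha * sum(variances) / dims

        # Relevant dimensions: those in which the reference points agree closely.
        relevant = [d for d in range(dims) if variances[d] < threshold]
        if not relevant:
            # No dimension stands out, so no subspace deviation is measured.
            scores.append(0.0)
            continue

        # Distance from p to the reference mean, restricted to the relevant
        # dimensions and normalised by how many of them there are.
        deviation = math.sqrt(sum((p[d] - means[d]) ** 2 for d in relevant))
        scores.append(deviation / len(relevant))
    return scores
```

A point that deviates strongly in a dimension where its neighbours all agree receives a high score, even if that deviation would be diluted by the remaining dimensions in a full-dimensional distance. The brute-force neighbour search is quadratic in the number of points and is meant for illustration only.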