Clustering High Dimensional Data Using Subspace and Projected Clustering Algorithms


Affiliations
1 Universiti Malaysia Pahang, Lebuhraya Tun Razak, 26300, Kuantan, Pahang Darul Makmur, Malaysia
 

Problem statement: Many clustering techniques have been developed in statistics, pattern recognition, data mining, and other fields. Subspace clustering enumerates clusters of objects in all subspaces of a dataset, and it tends to produce many overlapping clusters. Approach: Subspace clustering and projected clustering are research areas for clustering in high-dimensional spaces. In this research we experiment with three clustering-oriented algorithms: PROCLUS, P3C, and STATPC. Results: In general, PROCLUS performs better in terms of computation time and produces the fewest unclustered data points, while STATPC outperforms PROCLUS and P3C in the accuracy of both the cluster points and the relevant attributes found. Conclusions/Recommendations: In this study, we analyze in detail the properties of different data clustering methods.

Keywords

Clustering, Projected Clustering, Subspace Clustering, Clustering Oriented, PROCLUS, P3C, STATPC.






Authors

Rahmat Widia Sembiring
Universiti Malaysia Pahang, Lebuhraya Tun Razak, 26300, Kuantan, Pahang Darul Makmur, Malaysia
Jasni Mohamad Zain
Universiti Malaysia Pahang, Lebuhraya Tun Razak, 26300, Kuantan, Pahang Darul Makmur, Malaysia
Abdullah Embong
Universiti Malaysia Pahang, Lebuhraya Tun Razak, 26300, Kuantan, Pahang Darul Makmur, Malaysia
