
An Efficient Incremental Density based Clustering Algorithm Fused with Noise Removal and Outlier Labelling Technique


Affiliations
1 Department of Computer Science, The North Cap University, Gurgaon – 122017, Haryana, India
 

Objectives: Owing to advances in storage technology, every moderate to large organization now keeps a huge amount of multifaceted data that is growing very fast. Dealing with such enormous data requires efficient data analysis techniques such as classification and clustering. Methods/Statistical Analysis: Processing high-dimensional data sets in the presence of noise and outliers can degrade the performance of any kind of data analysis task. The situation can be even worse for unsupervised classification (i.e., clustering). In this paper, we propose a new method for incremental density-based clustering of high-dimensional data sets with reasonable speed-up. The proposed method is fused with a noise removal and outlier labeling technique inspired by the well-known box plot method. Findings: The performance of the fusion is analyzed on five high-dimensional data sets taken from the University of California, Irvine (UCI) repository using four cluster evaluation metrics (F-Measure, Entropy, Purity and Speed Up). The clustering results confirm the effectiveness of the proposed fusion. Application/Improvements: The proposed technique can be refined by hybridizing it with a metaheuristic technique for stock exchange applications.

Keywords

Box Plot, Density Based Clustering, DBSCAN, Entropy and Incremental Partitioning.
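
The abstract names the box plot method as the inspiration for the outlier labeling step but does not give the exact rule. As a minimal sketch, assuming the standard box plot fences (Q1 - 1.5 × IQR and Q3 + 1.5 × IQR) applied to each feature independently, a row could be labeled an outlier when any of its features falls outside its fences. The function names `box_plot_fences` and `label_outliers` and the multiplier `k` are illustrative assumptions, not taken from the paper.

```python
from statistics import quantiles


def box_plot_fences(values, k=1.5):
    """Return the (lower, upper) box plot fences for one feature.

    Assumption: the standard rule Q1 - k*IQR and Q3 + k*IQR with k = 1.5;
    the paper's exact thresholds are not stated in the abstract.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def label_outliers(rows, k=1.5):
    """Label each row True if any feature lies outside that feature's fences."""
    columns = list(zip(*rows))  # transpose rows into per-feature columns
    fences = [box_plot_fences(col, k) for col in columns]
    return [
        any(v < lo or v > hi for v, (lo, hi) in zip(row, fences))
        for row in rows
    ]


# Toy data (made up for illustration): the last row has an extreme first feature.
data = [
    [1.0, 10.0],
    [2.0, 11.0],
    [3.0, 12.0],
    [4.0, 13.0],
    [100.0, 12.0],
]
labels = label_outliers(data)
```

Rows labeled `True` could then be set aside or tagged before a density-based clusterer such as DBSCAN is run on the remaining points; how the paper combines the two steps incrementally is not described in the abstract.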

Authors

Pooja Yadav
Department of Computer Science, The North Cap University, Gurgaon – 122017, Haryana, India
Anuradha
Department of Computer Science, The North Cap University, Gurgaon – 122017, Haryana, India
Poonam Sharma
Department of Computer Science, The North Cap University, Gurgaon – 122017, Haryana, India



DOI: https://doi.org/10.17485/ijst%2F2016%2Fv9i48%2F138435