A Survey on Efficient Big Data Clustering Using MapReduce

Avinash Dhanshetti; Tushar Rane

Affiliations
1 Pune Institute of Computer Technology, Pune, Maharashtra, India
2 Information Technology Department, Pune Institute of Computer Technology, Pune, Maharashtra, India

Abstract

Cluster analysis is a key technique used by data processing algorithms in data mining. The primary aim of clustering is to partition the data into smaller subsets called clusters, such that data belonging to the same cluster are similar under some similarity metric. Clustering is an essential idea in data analysis and data mining applications. Over the years, K-means has been a popular clustering algorithm because of its simplicity and ease of use. Nowadays, as data sizes continue to increase, researchers have started working in distributed environments such as MapReduce to achieve high performance for big data clustering. In this paper, we survey current work on efficient big data clustering algorithms using the MapReduce framework.
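
To make the connection between K-means and MapReduce concrete, the following is a minimal single-machine sketch of how one K-means iteration is commonly split into a map step (assign each point to its nearest centroid) and a reduce step (average the points assigned to each centroid). It is an illustration only, not an algorithm taken from any of the surveyed works: the function names `map_phase`, `reduce_phase`, and `kmeans`, the use of Euclidean distance, and the fixed iteration count are all assumptions, and no real MapReduce framework is involved.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every input point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce: group points by centroid index and average each group."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    # Assumption: a centroid that received no points keeps its old position.
    new_centroids = list(centroids)
    for idx, members in groups.items():
        new_centroids[idx] = tuple(sum(c) / len(members) for c in zip(*members))
    return new_centroids


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of map/reduce rounds (hypothetical driver loop)."""
    for _ in range(iterations):
        centroids = reduce_phase(map_phase(points, centroids), centroids)
    return centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(data, [(0, 0), (10, 10)]))
```

In a distributed setting the map step would run in parallel over partitions of the data and the grouping by centroid index would be handled by the framework's shuffle; here both are simulated in-process for clarity.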