
Diagnosing Diabetic Dataset using Hadoop and K-means Clustering Techniques.







Authors

K. Sharmila
Department of Computer Science, Vels University, Chennai - 600117, Tamil Nadu, India
S. A. Vetha Manickam
Department of Computer Science, Vels University, Chennai - 600117, Tamil Nadu, India

Abstract


Objectives: This article demonstrates how the enormous amount of data generated in the field of healthcare systems can be analyzed using clustering techniques. Extracting useful information from this huge volume of data is highly complex, costly, and time-consuming; in such an area, data mining can play a key role. In particular, standard data mining algorithms can be parallelized for faster processing of very large data volumes. Methods/Statistical Analysis: This paper focuses on how a clustering algorithm, namely K-means, can be run on a parallel processing platform, namely an Apache Hadoop cluster (the MapReduce paradigm), in order to analyze huge amounts of data faster. Findings: As a starting point, we carried out experiments to assess the effectiveness of the parallel processing platform in terms of performance. Applications/Improvements: The final results show that Apache Hadoop with K-means clustering is a promising paradigm for scalable performance in predicting and diagnosing diabetic diseases from large amounts of data. The proposed work gives an insight into big data prediction on a diabetic dataset through Hadoop. In future, this technology should be extended to the cloud so as to connect various geographic districts around Tamil Nadu for predicting diabetes-related diseases.
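
The paper itself does not publish code, but the Methods description (K-means on an Apache Hadoop cluster under the MapReduce paradigm) maps naturally onto a mapper/reducer pair. The Java sketch below is a minimal, hypothetical illustration of one K-means iteration, assuming two numeric features per patient record (e.g., plasma glucose and BMI) and hard-coded initial centroids; the class names and record format are illustrative assumptions, not taken from the paper. In practice a driver program would distribute the current centroids to the mappers (e.g., via the job configuration) and rerun the job until the centroids stop moving.

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical sketch: one K-means iteration as a single MapReduce pass.
public class KMeansIteration {

  // Assumption: the current centroids are shipped to every mapper by the
  // driver; hard-coded here for brevity (two features per record).
  static final double[][] CENTROIDS = {
      {100.0, 30.0}, {140.0, 35.0}, {180.0, 40.0}
  };

  public static class AssignMapper
      extends Mapper<LongWritable, Text, IntWritable, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // Each input line is one patient record: comma-separated features.
      String[] fields = value.toString().split(",");
      double[] point = new double[fields.length];
      for (int i = 0; i < fields.length; i++)
        point[i] = Double.parseDouble(fields[i].trim());

      // Find the nearest centroid by squared Euclidean distance.
      int best = 0;
      double bestDist = Double.MAX_VALUE;
      for (int c = 0; c < CENTROIDS.length; c++) {
        double d = 0.0;
        for (int i = 0; i < point.length; i++) {
          double diff = point[i] - CENTROIDS[c][i];
          d += diff * diff;
        }
        if (d < bestDist) { bestDist = d; best = c; }
      }
      // Emit (cluster id, record) so the reducer can recompute the mean.
      context.write(new IntWritable(best), value);
    }
  }

  public static class RecomputeReducer
      extends Reducer<IntWritable, Text, IntWritable, Text> {
    @Override
    protected void reduce(IntWritable clusterId, Iterable<Text> records,
        Context context) throws IOException, InterruptedException {
      // Average all records assigned to this cluster -> updated centroid.
      double[] sum = null;
      long count = 0;
      for (Text record : records) {
        String[] fields = record.toString().split(",");
        if (sum == null) sum = new double[fields.length];
        for (int i = 0; i < fields.length; i++)
          sum[i] += Double.parseDouble(fields[i].trim());
        count++;
      }
      StringBuilder centroid = new StringBuilder();
      for (int i = 0; i < sum.length; i++) {
        if (i > 0) centroid.append(',');
        centroid.append(sum[i] / count);
      }
      // The driver feeds these centroids into the next iteration until
      // they converge.
      context.write(clusterId, new Text(centroid.toString()));
    }
  }
}

Because each mapper needs only the current centroids and its own input split, the assignment step scales out across the cluster, while the reduce step handles only k keys; this is what makes the MapReduce formulation attractive for clustering very large datasets such as the diabetic records considered here.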

Keywords


Apache Hadoop, K-means, MapReduce.



DOI: https://doi.org/10.17485/ijst/2016/v9i40/126204