
Diagnosing Diabetic Dataset using Hadoop and K-means Clustering Techniques.







Authors

K. Sharmila
Department of Computer Science, Vels University, Chennai - 600117, Tamil Nadu, India
S. A. Vetha Manickam
Department of Computer Science, Vels University, Chennai - 600117, Tamil Nadu, India

Abstract


Objectives: This article demonstrates how the enormous amount of data generated in the field of healthcare systems can be analyzed using clustering techniques. Extracting useful information from this huge volume of data is highly complex, costly, and time-consuming; in such an area, data mining can play a key role. In particular, standard data mining algorithms can be parallelized for faster processing of very large data volumes. Methods/Statistical Analysis: This paper focuses on how a clustering algorithm, namely K-means, can be run on a parallel processing platform, namely an Apache Hadoop cluster (the MapReduce paradigm), in order to analyze huge amounts of data faster. Findings: As a starting point, we carried out experiments to assess the effectiveness of the parallel processing platform in terms of performance. Applications/Improvements: The final results show that Apache Hadoop with K-means clustering is a promising paradigm for scalable performance in predicting and diagnosing diabetic diseases from large amounts of data. The proposed work gives an insight into big data prediction on a diabetic dataset through Hadoop. In future, this technology should be extended to the cloud so as to connect various geographic districts around Tamil Nadu for predicting diabetes-related diseases.
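
The paper itself does not publish code, but the Methods description (K-means on an Apache Hadoop cluster under the MapReduce paradigm) maps naturally onto a mapper/reducer pair. The Java sketch below is a minimal, hypothetical illustration of one K-means iteration, assuming two numeric features per patient record (e.g., plasma glucose and BMI) and hard-coded initial centroids; the class names and record format are illustrative assumptions, not taken from the paper. In practice a driver program would distribute the current centroids to the mappers (e.g., via the job configuration) and rerun the job until the centroids stop moving.

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical sketch: one K-means iteration as a single MapReduce pass.
public class KMeansIteration {

  // Assumption: the current centroids are shipped to every mapper by the
  // driver; hard-coded here for brevity (two features per record).
  static final double[][] CENTROIDS = {
      {100.0, 30.0}, {140.0, 35.0}, {180.0, 40.0}
  };

  public static class AssignMapper
      extends Mapper<LongWritable, Text, IntWritable, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // Each input line is one patient record: comma-separated features.
      String[] fields = value.toString().split(",");
      double[] point = new double[fields.length];
      for (int i = 0; i < fields.length; i++)
        point[i] = Double.parseDouble(fields[i].trim());

      // Find the nearest centroid by squared Euclidean distance.
      int best = 0;
      double bestDist = Double.MAX_VALUE;
      for (int c = 0; c < CENTROIDS.length; c++) {
        double d = 0.0;
        for (int i = 0; i < point.length; i++) {
          double diff = point[i] - CENTROIDS[c][i];
          d += diff * diff;
        }
        if (d < bestDist) { bestDist = d; best = c; }
      }
      // Emit (cluster id, record) so the reducer can recompute the mean.
      context.write(new IntWritable(best), value);
    }
  }

  public static class RecomputeReducer
      extends Reducer<IntWritable, Text, IntWritable, Text> {
    @Override
    protected void reduce(IntWritable clusterId, Iterable<Text> records,
        Context context) throws IOException, InterruptedException {
      // Average all records assigned to this cluster -> updated centroid.
      double[] sum = null;
      long count = 0;
      for (Text record : records) {
        String[] fields = record.toString().split(",");
        if (sum == null) sum = new double[fields.length];
        for (int i = 0; i < fields.length; i++)
          sum[i] += Double.parseDouble(fields[i].trim());
        count++;
      }
      StringBuilder centroid = new StringBuilder();
      for (int i = 0; i < sum.length; i++) {
        if (i > 0) centroid.append(',');
        centroid.append(sum[i] / count);
      }
      // The driver feeds these centroids into the next iteration until
      // they converge.
      context.write(clusterId, new Text(centroid.toString()));
    }
  }
}

Because each mapper needs only the current centroids and its own input split, the assignment step scales out across the cluster, while the reduce step handles only k keys; this is what makes the MapReduce formulation attractive for clustering very large datasets such as the diabetic records considered here.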

Keywords


Apache Hadoop, K-means, MapReduce.



DOI: https://doi.org/10.17485/ijst/2016/v9i40/126204