Analysis of Depth of Entropy and GINI Index Based Decision Trees for Predicting Diabetes
Diabetes is a disease caused by malfunctioning of the pancreas, in which the pancreas either no longer produces insulin or produces too little of it. If the disease is diagnosed early, several health complications can be avoided through timely precautions; otherwise, it may lead to serious health problems. Machine learning models are now widely researched for diabetes prediction. This research work uses decision tree classifiers for diabetes prediction and analyzes the impact of decision tree depth on prediction performance. It also compares the performance of ID3 and CART decision trees for diabetes prediction. From the empirical observations, we conclude that the CART algorithm performs slightly better than ID3 and that the best prediction performance is achieved with decision trees of depth 4.
Keywords
CART, Decision Tree Classification, Depth Analysis, Diabetes Prediction, ID3.
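The two split criteria named in the title can be illustrated with a minimal sketch. ID3 conventionally selects splits by information gain (entropy reduction), while CART conventionally uses the Gini index. The toy labels and the helper names `entropy`, `gini`, and `impurity_decrease` below are hypothetical, chosen only for illustration; they are not the authors' data or implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits of a list of class labels (ID3-style criterion)."""
    if not labels:
        return 0.0
    n = len(labels)
    # sum of p * log2(1/p) over the observed classes
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a list of class labels (CART-style criterion)."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_decrease(parent, left, right, criterion):
    """Impurity of the parent minus the size-weighted impurity of its two children."""
    n = len(parent)
    weighted = (len(left) / n) * criterion(left) + (len(right) / n) * criterion(right)
    return criterion(parent) - weighted


# Hypothetical toy labels (1 = diabetic, 0 = non-diabetic); not taken from the paper.
parent = [1, 1, 1, 1, 0, 0, 0, 0]
left = [1, 1, 1, 0]    # samples sent to the left child by some candidate split
right = [1, 0, 0, 0]   # samples sent to the right child

print("information gain:", impurity_decrease(parent, left, right, entropy))
print("Gini decrease:   ", impurity_decrease(parent, left, right, gini))
```

Both criteria score the same candidate split, only on different scales. A depth study like the one described would additionally cap how many such splits may be stacked; in scikit-learn, for example, this corresponds to the `criterion` and `max_depth` parameters of `DecisionTreeClassifier`, although that library implements an optimized CART rather than ID3, so using it here would be an assumption about the setup rather than the authors' stated method.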