
Analysis of Depth of Entropy and GINI Index Based Decision Trees for Predicting Diabetes


Affiliations
1 Assistant Professor, with Central Department of Computer Science and IT, Tribhuvan University, Kathmandu, Bagmati - 44613, Nepal
2 Professor with Department of Electronics and Computer Engineering, IOE, Tribhuvan University, Lalitpur, Bagmati - 44700, Nepal
3 Lecturer with Nagarjuna College of IT, Tribhuvan University, Lalitpur, Bagmati - 44700, Nepal



Abstract

Diabetes is a disease caused by malfunctioning of the pancreas, in which the pancreas either no longer produces insulin or produces too little of it. If the disease is diagnosed early, several health complications can be avoided by taking timely precautions; otherwise, it may lead to serious health problems. Machine learning models are now widely researched for diabetes prediction. This research work uses decision tree classifiers for diabetes prediction and analyzes the impact of decision tree depth on prediction performance. It also compares the performance of ID3 and CART decision trees for diabetes prediction. From the empirical observations, we conclude that the CART algorithm performs slightly better than ID3 and that the best prediction performance is achieved with decision trees of depth 4.
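To make the comparison concrete, the following is a minimal sketch, not the authors' implementation, of a depth-limited decision tree that can be grown with either entropy (the impurity measure behind ID3's information gain) or the Gini index (used by CART). It simplifies both algorithms to binary threshold splits on numeric features, so the two variants differ only in the impurity function. The function names and the toy (glucose, BMI) rows are invented for illustration and are not taken from the study's dataset.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a non-empty label list (ID3-style impurity)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index of a non-empty label list (CART-style impurity)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, impurity):
    """Return (gain, feature, threshold) with the largest impurity reduction, or None."""
    parent, n, best = impurity(labels), len(labels), None
    for f in range(len(rows[0])):
        # Skip the largest value so that neither side of the split is empty.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            weighted = (len(left) * impurity(left) + len(right) * impurity(right)) / n
            gain = parent - weighted
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def build(rows, labels, impurity, max_depth):
    """Grow a tree of at most max_depth levels; a leaf is the majority class label."""
    majority = Counter(labels).most_common(1)[0][0]
    split = best_split(rows, labels, impurity) if max_depth > 0 else None
    if split is None or split[0] <= 0:
        return majority
    _, f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build([r for r, _ in lo], [y for _, y in lo], impurity, max_depth - 1),
            build([r for r, _ in hi], [y for _, y in hi], impurity, max_depth - 1))

def predict(tree, row):
    """Follow threshold tests down to a leaf label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def accuracy(tree, rows, labels):
    """Fraction of rows whose predicted label matches the given label."""
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(labels)

# Hypothetical toy data: (glucose, BMI) pairs with 1 = diabetic, 0 = not diabetic.
ROWS = [(85, 22), (90, 31), (105, 35), (110, 24),
        (120, 23), (150, 27), (160, 33), (140, 21)]
LABELS = [0, 0, 1, 0, 0, 1, 1, 1]

for name, impurity in (("entropy", entropy), ("gini", gini)):
    for depth in range(0, 5):
        tree = build(ROWS, LABELS, impurity, depth)
        print(name, depth, accuracy(tree, ROWS, LABELS))
```

The loop reports accuracy on the same rows used for training, so it only shows how added depth lets a tree fit the data more closely. A depth analysis like the one described in the abstract would instead measure accuracy on held-out data, where overly deep trees can overfit.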

Keywords

CART, Decision Tree Classification, Depth Analysis, Diabetes Prediction, ID3.


Authors

Arjun Singh Saud
Assistant Professor, with Central Department of Computer Science and IT, Tribhuvan University, Kathmandu, Bagmati - 44613, Nepal
Subarna Shakya
Professor with Department of Electronics and Computer Engineering, IOE, Tribhuvan University, Lalitpur, Bagmati - 44700, Nepal
Bindu Neupane
Lecturer with Nagarjuna College of IT, Tribhuvan University, Lalitpur, Bagmati - 44700, Nepal

DOI: https://doi.org/10.17010/ijcs%2F2021%2Fv6%2Fi6%2F167641