An Approach for Diabetes Detection using Data Mining Classification Techniques


Affiliations
1 IKG Punjab Technical University, Jalandhar, Punjab, India
2 Beant College of Engineering and Technology, Gurdaspur, Punjab, India
3 PEC University of Technology, Chandigarh, India
 

Disease diagnosis by expert systems is one of the areas in which data mining tools are producing successful results. The aim of this paper is to find ways of diagnosing the disease by analyzing patterns found in the data with data mining techniques such as classification analysis. Classification is a common data mining technique that uses a set of pre-classified examples to build a model that can then classify large populations of records. Various classification techniques are used to analyze biomedical data, including Naive Bayes, Bayes Net, J48, SMO, and Random Forest. This paper compares different classification algorithms using Weka in order to determine which is most suitable. Under cross-validation, the best algorithm is the SMO classifier, with an accuracy of 77.34 % and the lowest average error among those compared, 22.65 %. Under percentage split, the best algorithm is the Decision Table classifier, with an accuracy of 81.99 % and the lowest average error among those compared, 18.00 %.

Keywords

Data Mining, Bioinformatics, Data Mining Techniques, Weka, Diabetes.
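
The two evaluation protocols named in the abstract, cross-validation and percentage split, can be illustrated with a minimal sketch in plain Python. It is not the authors' Weka setup: the function names `accuracy`, `percentage_split`, and `k_fold_splits` are hypothetical, the fold count and split percentage are placeholders (the abstract does not state which values were used), and, unlike Weka, the sketch neither shuffles nor stratifies the records.

```python
from typing import List, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of records whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def percentage_split(n: int, train_percent: int) -> Tuple[List[int], List[int]]:
    """Hold-out evaluation: one fixed part trains the model, the rest tests it."""
    cut = n * train_percent // 100  # integer arithmetic avoids float rounding
    indices = list(range(n))
    return indices[:cut], indices[cut:]


def k_fold_splits(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """k-fold cross-validation: each record lands in the test set exactly once."""
    indices = list(range(n))
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds covering all records
    splits = []
    for i, test in enumerate(folds):
        # Train on every fold except the one currently held out for testing.
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits


# Error rate is the complement of accuracy, which is how the abstract pairs them.
acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
print(f"accuracy = {acc:.2%}, error = {1 - acc:.2%}")
```

Because a percentage split scores the model on a single held-out portion while cross-validation averages over every fold, the two protocols can rank classifiers differently, which is consistent with the abstract reporting a different best algorithm for each.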
Authors

Sonu Bala Garg
IKG Punjab Technical University, Jalandhar, Punjab, India
Ajay Kumar Mahajan
Beant College of Engineering and Technology, Gurdaspur, Punjab, India
T. S. Kamal
PEC University of Technology, Chandigarh, India
