
Early Onset Detection of Diabetes Using Feature Selection and Boosting Techniques


Affiliations
1 Department of Computer Science and Engineering, Sri Venkateswara College of Engineering, India
2 Department of MCA, DG Vaishnav College, India
     



Diabetes is one of the most common diseases in humans. It is a metabolic disease with no permanent cure, but early detection can increase longevity. This research work focuses on predicting the early onset of diabetes, using the diabetes dataset from the UCI Machine Learning Repository. Preprocessing is applied to make the data more robust and suitable for further processing. Two feature selection techniques and two ensemble boosting techniques are proposed, giving four combinations (models) for predicting the presence of diabetes in a person. As a novelty, the number of features chosen by the feature selection techniques is reduced further, which lowers the memory and time complexity of the model. Among the proposed models, Light Gradient Boosting (LightGBM) with Recursive Feature Elimination (RFE) as the feature selector performed best. LightGBM with the fewest features still gave satisfactory results.

Keywords

Data Mining, Boosting, Medical Mining, Diabetes, Feature Selection.
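
The combination of a feature selector with a boosting classifier described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data is synthetic (generated with `make_classification`) rather than the UCI dataset, scikit-learn's `GradientBoostingClassifier` stands in for LightGBM, and the feature counts (16, 8, 4) and the helper name `build_model` are assumptions chosen only for illustration. LightGBM's `LGBMClassifier` follows the same scikit-learn estimator interface and could be swapped in where it is installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption): 500 samples, 16 features.
X, y = make_classification(
    n_samples=500, n_features=16, n_informative=5, n_redundant=2,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def build_model(n_features_to_keep):
    """Hypothetical helper: RFE selector followed by a boosting classifier."""
    # RFE refits the estimator repeatedly, dropping the least important
    # feature each round (step=1) until n_features_to_keep remain.
    selector = RFE(
        estimator=GradientBoostingClassifier(n_estimators=50, random_state=0),
        n_features_to_select=n_features_to_keep,
        step=1,
    )
    # Stand-in for LightGBM (assumption); trained on the selected features only.
    classifier = GradientBoostingClassifier(n_estimators=100, random_state=0)
    return Pipeline([("select", selector), ("boost", classifier)])


# Compare a moderately reduced feature set with a further reduced one.
results = {}
for k in (8, 4):
    model = build_model(k).fit(X_train, y_train)
    kept = model.named_steps["select"].support_  # boolean mask of kept features
    results[k] = (int(kept.sum()), model.score(X_test, y_test))
    print(f"{k} features kept: test accuracy = {results[k][1]:.3f}")
```

Wrapping the selector and classifier in one `Pipeline` keeps feature selection inside the training step, so the held-out split is never used to choose features. The accuracies printed here describe only the synthetic data and say nothing about the results reported in the paper.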




Authors

Shruti Srivatsan
Department of Computer Science and Engineering, Sri Venkateswara College of Engineering, India
T. Santhanam
Department of MCA, DG Vaishnav College, India


References