
Comparative Study for Prediction of Low and High Plasma Protein Binding Drugs by Various Machine Learning-Based Classification Algorithms


Affiliations
1 School of Life Sciences, Jaipur National University, Jaipur - 302025, Rajasthan, India
2 Birla Institute of Applied Sciences, Bhimtal, Nainital - 263136, Uttarakhand, India
3 National Centre for Cell Science, NCCS Complex, Pune University Campus, Pune - 411007, Maharashtra, India
 

Abstract

In drug discovery, most drug candidates fail at early stages because of their pharmacokinetic behavior in the biological system. Early prediction of pharmacokinetic properties, combined with screening methods, can reduce the time and investment needed for lead discovery. Plasma protein binding is one such property and plays a vital role in drug discovery and development. The present study develops a computational model that classifies drugs as Low Plasma Protein Binding (LPPB) or High Plasma Protein Binding (HPPB) using machine learning methods in WEKA, for early screening of molecules. Data on plasma protein binding drugs were collated from the DrugBank database, where 617 drug candidates were found to interact with plasma proteins. From these, equal proportions of high and low plasma protein binding drugs were extracted to build a training set of ~300 drugs. The machine learning algorithms were trained on the training set and evaluated on a test set. Four machine learning-based classification algorithms were compared, namely Naïve Bayes, Instance-Based Learner (IBk), multilayer perceptron, and random forest, to determine the best model based on accuracy. The random forest-based model outperformed the other classification methods, with an accuracy of 99.67% and a kappa value of 0.9933 on the training set and on the test set, and can predict the plasma protein binding capacity of drugs in the given data set using the WEKA tool.

Keywords

Drug Discovery, Machine Learning, Multilayer Perceptron, Pharmacokinetic Plasma Protein Binding, Random Forest
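
The abstract reports accuracy and Cohen's kappa together, so it may help to see how the two relate. The sketch below computes both from a two-class confusion matrix. The matrix is hypothetical and is not taken from the paper: it assumes exactly 300 drugs, split 150 LPPB and 150 HPPB, with a single misclassification. That assumption is consistent with the reported 99.67% and 0.9933, but the source only states "~300 drugs" and does not give the confusion matrix. The function name `accuracy_and_kappa` is likewise illustrative and is not part of WEKA.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of items whose actual class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: fraction of items on the diagonal (plain accuracy).
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [
        sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)
    ]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# ASSUMPTION (not from the paper): 150 LPPB and 150 HPPB drugs,
# with one HPPB drug misclassified as LPPB.
hypothetical_confusion = [
    [150, 0],   # actual LPPB: predicted LPPB, predicted HPPB
    [1, 149],   # actual HPPB: predicted LPPB, predicted HPPB
]

accuracy, kappa = accuracy_and_kappa(hypothetical_confusion)
print(f"accuracy = {accuracy:.2%}, kappa = {kappa:.4f}")
# prints: accuracy = 99.67%, kappa = 0.9933
```

With balanced classes the chance agreement is 0.5, so kappa reduces to (accuracy - 0.5) / 0.5. Under the assumed balance, this is why an accuracy of 99.67% corresponds to a kappa of 0.9933.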
Authors

Sumit Govil
School of Life Sciences, Jaipur National University, Jaipur - 302025, Rajasthan, India
Sandesh Tripathi
Birla Institute of Applied Sciences, Bhimtal, Nainital - 263136, Uttarakhand, India
Amit Kumar
School of Life Sciences, Jaipur National University, Jaipur - 302025, Rajasthan, India
Divya Shrivastava
School of Life Sciences, Jaipur National University, Jaipur - 302025, Rajasthan, India
Shailesh Kumar
National Centre for Cell Science, NCCS Complex, Pune University Campus, Pune - 411007, Maharashtra, India

