This research assesses a customer's suitability for a loan using predictive modeling, a machine learning approach. Banks and Non-Banking Financial Companies (NBFCs) are at risk of significant Non-Performing Assets (NPAs) when customers fail to repay their loans. The data for this study came from Kaggle, and eight prediction models were used to determine whether a borrower would be able to repay the loan: Adaboost, k-Nearest Neighbors (k-NN), Logistic Regression, Support Vector Machines (SVM), Decision Tree, Naive Bayes, Neural Networks, and Random Forest (RF). The aim is to support decisions based on factual evidence rather than subjective judgment. Four performance measures are used to evaluate the results: Classification Accuracy, Precision, Recall, and F1 score. The dataset is split into training and test sets of 70% and 30%, respectively. The analysis proceeds in two phases: each model is first trained on the 70% training set, and its predictions are then evaluated on the 30% test set. The purpose of this study is to examine how objective characteristics influence loan default, to identify the most common reasons for default, and to predict which customers will default. We carried out two evaluations. First, we used the full training set to make predictions with each model. The Adaboost model delivers the best results, with a recall rate of 0.384, a classification accuracy of 59.2 percent, and a true-positive rate of 69.74 percent. Second, we performed feature selection and found that Credit History, at 31 percent, had the greatest impact on loan default detection. Partitioning the dataset by Credit_History (1 and 0), we found that Credit_History 1 produces better results, with a recall rate of 0.444, a classification accuracy of 60.5 percent, and a true-positive rate of 68.7 percent.
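The 70/30 split and the four performance measures named above can be sketched in plain Python. This is only an illustrative sketch under stated assumptions: the toy records, the helper names `train_test_split` and `classification_metrics`, and the placeholder prediction rule are all hypothetical and are not the study's data, code, or models.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy records of the form (credit_history, defaulted).
records = [(i % 2, 1 if i % 5 == 0 else 0) for i in range(100)]
train, test = train_test_split(records)  # 70 training rows, 30 test rows

# Placeholder rule standing in for a trained model (an assumption):
# predict default whenever credit_history is 0.
y_true = [defaulted for _, defaulted in test]
y_pred = [1 if credit_history == 0 else 0 for credit_history, _ in test]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In practice, the placeholder rule would be replaced by the predictions of each of the eight trained classifiers, and the same four measures would be compared across models on the held-out 30%.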

Keywords

Adaboost, Decision Tree, k-Nearest Neighbors (k-NN), Logistic Regression, Naïve Bayes, Neural Network, Non-Banking Financial Companies (NBFC), Support Vector Machine (SVM), Random Forest.