A Novel Credit Scoring Prediction Model based on Feature Selection Approach and Parallel Random Forest

Ha Van Sang; Nguyen Ha Nam; Nguyen Duc Nhan

doi:10.17485/ijst/2016/v9i20/133289

Ha Van Sang ¹, Nguyen Ha Nam ², Nguyen Duc Nhan ³

Affiliations
1 Department of Economic Information System, Academy of Finance, Hanoi, Viet Nam
2 Department of Information Technology, VNU-University of Engineering and Technology, Hanoi, Viet Nam
3 Department, Faculty of Telecommunications, Posts and Telecommunications Institute of Technology, Hanoi, Viet Nam

Abstract

Background/Objectives: This article presents a feature selection method that improves the accuracy and computation speed of credit scoring models. Methods/Analysis: We propose a credit scoring model that combines a parallel Random Forest classifier with a feature selection method to evaluate the credit risk of applicants. Integrating Random Forest into the feature selection process allows the importance of each feature to be evaluated accurately, so that irrelevant and redundant features can be removed. Findings: We developed an algorithm that selects the best features using the best average score, the best median score, and the lowest standard deviation as the feature scoring rules. Consequently, the number of features can be reduced to the smallest possible number, which yields a remarkable reduction in runtime. The proposed model can thus perform feature selection and model parameter optimization at the same time to improve its efficiency. The performance of the proposed model was assessed experimentally on two public datasets, the Australian and German credit datasets. The results showed that the proposed model achieves improved accuracy compared with other commonly used feature selection methods. In particular, our method attains an average accuracy of 76.2% with a significantly reduced running time of 72 minutes on the German credit dataset, and the highest average accuracy of 89.4% with a running time of only 50 minutes on the Australian credit dataset. Applications/Improvements: This method can be usefully applied in credit scoring models to improve accuracy with a significantly reduced runtime.
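
The scoring rule named in the abstract (best average, best median, lowest standard deviation) can be illustrated with a minimal sketch. The abstract does not say how the three statistics are combined, so the lexicographic ordering below is an assumption, as are the function names, the feature names, and the example scores; this is not the authors' implementation, and it leaves out the Random Forest training and the parallel execution entirely.

```python
from statistics import mean, median, pstdev


def rank_features(scores_by_feature):
    """Rank features from repeated importance scores.

    scores_by_feature maps a feature name to the list of importance
    scores it received over repeated runs (for example, repeated
    Random Forest fits). Assumed rule, not taken from the paper:
    higher mean first, then higher median, then lower standard
    deviation as the final tie-breaker.
    """
    def sort_key(item):
        _, scores = item
        # Negate mean and median so that larger values sort first;
        # the standard deviation is kept positive so smaller sorts first.
        return (-mean(scores), -median(scores), pstdev(scores))

    ranked = sorted(scores_by_feature.items(), key=sort_key)
    return [name for name, _ in ranked]


def select_top(scores_by_feature, k):
    """Keep the k best-ranked features."""
    return rank_features(scores_by_feature)[:k]


# Hypothetical importance scores (in percent) from three runs.
example_scores = {
    "income": [30, 32, 31],
    "duration": [31, 31, 31],
    "age": [10, 12, 11],
    "noise": [1, 5, 0],
}

ranking = rank_features(example_scores)
top_two = select_top(example_scores, 2)
```

In this example "income" and "duration" tie on mean and median, so the lower standard deviation of "duration" places it first. A weighted combination of the three statistics would be an equally plausible reading of the abstract.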

Keywords

Credit Scoring, Feature Selection, Machine Learning, Parallel Random Forest
