Evaluation of Cost Sensitive Learning for Imbalanced Bank Direct Marketing Data

Khor Kok-Chin; Ng Keng-Hoong

doi:10.17485/ijst/2016/v9i42/123949


Affiliations
1 Faculty of Computing and Informatics, Multimedia University, 63100, Cyberjaya, Selangor, Malaysia

Abstract

Objectives: The imbalanced bank direct marketing data set used in this study poses a two-class data mining problem, in which a customer either subscribes or does not subscribe to a product from a bank. Methods/Statistical Analysis: The data set exhibits the rare class problem, in which the classification rate attained for the rare class is low. In this study, we applied cost sensitive learning to mitigate the problem and to account for the different costs incurred when misclassification occurs. Three learning algorithms, namely Naive Bayes (NB), C4.5 and Naive Bayes Tree (NBT), were used in the cost sensitive learning, and their results were empirically evaluated. Findings: The results were also compared with two previous studies that used a cost insensitive SVM and over-sampling, respectively. Although cost sensitive learning is claimed to be able to handle imbalanced data sets, we found that overall it was less effective for the bank direct marketing data set. Cost sensitive learning provides a way of “wrapping” learning algorithms that are not designed to handle imbalanced class distributions. Therefore, it may not work well for certain imbalanced data sets. Over-sampling, on the other hand, worked well for the data set. Improvements/Applications: Over-sampling helped to generalize the decision region of the rare class clearly and subsequently improved the classification result.
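
The two ideas contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the cost values and the class encoding (0 = no subscription, 1 = subscription, the rare class) are assumptions made for illustration. The first function shows one common form of the "wrapping" idea, in which the class probabilities of any base learner are combined with a cost matrix and the class with the lowest expected cost is predicted. The second shows plain random over-sampling; the abstract does not state which over-sampling variant the earlier study used.

```python
import random


def min_expected_cost_class(probs, cost):
    """Pick the class with the lowest expected misclassification cost.

    probs[i]   : estimated probability that the true class is i
    cost[i][j] : cost of predicting class j when the true class is i
    """
    n_classes = len(probs)
    expected = [
        sum(probs[i] * cost[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    return min(range(n_classes), key=expected.__getitem__)


def random_oversample(rows, labels, rare_label, seed=0):
    """Duplicate randomly chosen rare-class rows until both classes are equal in size.

    Assumes a two-class problem with at least one rare-class row.
    """
    rng = random.Random(seed)
    rare = [r for r, y in zip(rows, labels) if y == rare_label]
    n_common = len(labels) - len(rare)
    extra = [rng.choice(rare) for _ in range(max(0, n_common - len(rare)))]
    return rows + extra, labels + [rare_label] * len(extra)


# Hypothetical costs: missing a subscriber is 5 times as costly as a wasted call.
hypothetical_cost = [[0, 1],
                     [5, 0]]
prediction = min_expected_cost_class([0.8, 0.2], hypothetical_cost)
```

With these assumed costs, a customer whose estimated subscription probability is only 0.2 is still predicted as a subscriber, because the expected cost of predicting class 1 (0.8) is lower than that of predicting class 0 (1.0). The base learner itself is unchanged, which is the sense in which the approach only wraps an algorithm that was not designed for imbalanced class distributions.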

Keywords

Bank Direct Marketing, Cost Sensitive Learning, Imbalanced Data Set, Rare Class Problem, Over-Sampling.




Indian Journal of Science and Technology
