On Sampling Techniques for Corporate Credit Scoring

Hung Ba Nguyen; Van-Nam Huynh

doi:10.20965/jaciii.2020.p0048

Abstract

Class imbalance is a critical problem in many real-world applications. Classifiers trained on imbalanced datasets tend to be biased toward the majority class, which severely degrades their accuracy on the minority class. Misclassifying the minority class is costly, especially in credit-granting decisions, where the minority class consists of bad loan applications. This study compares the industry standard with well-known machine learning and ensemble models under several imbalance treatment approaches, and shows how these models can perform relative to the industry standard in credit scoring. More importantly, using a diverse set of performance measures reveals different weaknesses of a scoring model. Class balancing strategies can mitigate classifier errors, and both homogeneous and heterogeneous ensemble approaches yield the most significant improvements in credit scoring.
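
As a rough illustration of the two points above (a single measure can hide a weakness, and class balancing changes the training distribution), the following minimal sketch uses random oversampling on hypothetical toy data. Random oversampling is only one possible balancing strategy and is not necessarily the one used in the study; the function names, the 95/5 class split, and the data are all assumptions made for this example.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive):
    """Fraction of truly positive cases that were predicted positive."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    total = sum(1 for t in y_true if t == positive)
    return hits / total

# Hypothetical toy data: 95 good loans (label 0) and 5 bad loans (label 1).
labels = [0] * 95 + [1] * 5
rows = [[i] for i in range(100)]

# A trivial "classifier" that approves every application.
always_good = [0] * len(labels)
acc = accuracy(labels, always_good)
bad_recall = recall(labels, always_good, positive=1)

# Balance the classes before training a real model.
bal_rows, bal_labels = random_oversample(rows, labels)
balanced_counts = Counter(bal_labels)
```

In this toy setting the trivial classifier looks strong on accuracy while detecting no bad loans at all, which is why a scoring model should be judged on more than one measure. After oversampling, both classes contain the same number of rows, at the cost of repeated minority examples.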
