IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
The Performance Stability of Defect Prediction Models with Class Imbalance: An Empirical Study
Qiao YUShujuan JIANGYanmei ZHANG
Author information
JOURNAL FREE ACCESS

2017 Volume E100.D Issue 2 Pages 265-272

Details
Abstract

Class imbalance has drawn much attention of researchers in software defect prediction. In practice, the performance of defect prediction models may be affected by the class imbalance problem. In this paper, we present an approach to evaluating the performance stability of defect prediction models on imbalanced datasets. First, random sampling is applied to convert the original imbalanced dataset into a set of new datasets with different levels of imbalance ratio. Second, typical prediction models are selected to make predictions on these new constructed datasets, and Coefficient of Variation (C·V) is used to evaluate the performance stability of different models. Finally, an empirical study is designed to evaluate the performance stability of six prediction models, which are widely used in software defect prediction. The results show that the performance of C4.5 is unstable on imbalanced datasets, and the performance of Naive Bayes and Random Forest are more stable than other models.

Content from these authors
© 2017 The Institute of Electronics, Information and Communication Engineers
Previous article Next article
feedback
Top