Computer Software
Print ISSN : 0289-6540
Evaluation of Missing Value Imputation Methods for Effort Estimation Using Linear Regression
Koji TODA, Masateru TSUNODA
JOURNAL FREE ACCESS

2017 Volume 34 Issue 4 Pages 4_150-4_155

Abstract
Multivariate regression models are commonly used to estimate software development effort in support of project planning and/or management. Because the project data sets used for model construction often contain missing values, a complete data set with no missing values must first be built by using imputation methods. However, although there are several ways to build a complete data set, it is unclear which method is most suitable for project data. In this paper, using project data of 1364 cases (34% missing value rate) collected from several companies, we applied four imputation methods (the k-NN method, the applied CF method, the missForest method, and the Multiple Imputation method) and built regression models from the resulting data. Then, using project data of 160 cases with no missing values, we evaluated the estimation performance of the models built after each imputation method. The results showed that the Multiple Imputation method performed best.
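To make the imputation step concrete, the following is a minimal sketch of k-NN imputation, one of the four methods named above. It is an illustration only, not the authors' implementation: the abstract does not state the distance measure, the value of k, or any feature scaling, so the function name `knn_impute`, the root-mean-square distance over shared columns, `k=2`, and the toy `projects` data are all assumptions.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Hypothetical helper for illustration; the paper's exact k-NN variant
    is not specified in the abstract.
    """
    def distance(a, b):
        # Compare only the columns observed in both rows, and average the
        # squared differences so rows with fewer shared columns are not favored.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = []
    for i, row in enumerate(rows):
        filled = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are the other rows in which column j is observed.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:  # if no donor exists, the value stays missing
                filled[j] = sum(nearest) / len(nearest)
        completed.append(filled)
    return completed


# Toy data (assumed): columns could stand for size, team size, and effort.
projects = [
    [10.0, 2.0, 100.0],
    [12.0, None, 130.0],
    [11.0, 3.0, 115.0],
    [50.0, 9.0, 600.0],
]
completed = knn_impute(projects, k=2)
print(completed[1])  # [12.0, 2.5, 130.0]
```

In the study's workflow, a completed data set of this kind would then be used to fit the regression model, and the model would be evaluated on separate cases that have no missing values. The sketch leaves features unscaled, so in practice columns with large ranges would dominate the distance.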
© 2017 Japan Society for Software Science and Technology