2022 Volume 39 Issue 4 Pages 4_31-4_37
Open source software (OSS) development projects continually seek contributions from new developers to remain sustainable. Some projects use a label called “Good First Issue (GFI)” to support onboarding by marking issues suitable for new developers. However, many projects do not actively use GFI labels, because project maintainers must apply them manually, which is burdensome. The aim of this research is to construct a machine learning model that automatically classifies issues suitable for new developers in OSS projects. In this paper, we describe the results of constructing a classification model using the random forest method. We collected about 150,000 regular issues and about 10,000 GFIs and conducted 10-fold cross-validation, obtaining a precision of 0.91 and a recall of 0.30 (RQ1). We also analyzed the features with high importance and found that the contributor's role in a project is important for classifying GFIs (RQ2).
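
To make the two reported metrics concrete, the following minimal sketch computes precision and recall for a binary GFI classifier from true and predicted labels. The function name `precision_recall` and the toy label lists are hypothetical and chosen only for illustration; they are not the study's data or implementation.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 marks a GFI."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # GFIs correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # regular issues flagged as GFI
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # GFIs that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data (invented): 10 GFIs followed by 10 regular issues.
y_true = [1] * 10 + [0] * 10
# The hypothetical classifier flags only 3 issues, all of them real GFIs.
y_pred = [1] * 3 + [0] * 17

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case every flagged issue is a genuine GFI, yet most GFIs go unflagged. That is the same qualitative pattern as a high precision paired with a low recall: the issues the model labels are mostly correct, but many suitable issues are missed.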