Article ID: 2024DAT0002
The generalization performance of machine learning models deteriorates when the models are trained on mislabeled data. Existing methods for handling mislabeled data rely on pre-processing the data or on in-processing during training; consequently, applying them to an already trained model requires retraining. As model and dataset sizes grow, the cost of retraining becomes a significant issue, necessitating new approaches. In this paper, we propose a method that removes mislabeled data from trained models without retraining, via machine unlearning. The proposed method consists of two stages: first, detecting mislabeled data in a trained model, and second, unlearning the detected data from the model. We conduct extensive experiments on the MNIST dataset to evaluate the proposed method, performing separate experiments for the detection stage and the unlearning stage. Our findings demonstrate that the detection stage performs well when the proportion of mislabeled data is low, and that the unlearning stage effectively improves model accuracy. However, in an integrated experiment involving both stages, we observed an intriguing yet negative result: despite the effectiveness of each individual stage, model accuracy did not improve because the proportion of mislabeled data was high. Our code is available at https://github.com/speed1313/mislabel-unlearning.
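The abstract does not specify the algorithms used in either stage, but the composition of the two stages can be illustrated with a minimal sketch. The version below assumes loss-based detection (flagging the training examples the trained model fits worst) and gradient-ascent unlearning on the flagged examples; the `mislabel_fraction` parameter, both stage implementations, and the PyTorch setup are illustrative assumptions, not the paper's actual method.

```python
import torch
import torch.nn.functional as F

# --- Stage 1: detect likely mislabeled examples ---
# Assumption: loss-based filtering; the paper's detector may differ.
def detect_mislabeled(model, inputs, labels, mislabel_fraction=0.1):
    """Flag the training examples with the highest per-sample loss."""
    model.eval()
    with torch.no_grad():
        losses = F.cross_entropy(model(inputs), labels, reduction="none")
    k = int(mislabel_fraction * len(labels))
    return losses.topk(k).indices  # indices of suspected mislabeled examples

# --- Stage 2: unlearn the flagged examples ---
# Assumption: gradient ascent on the flagged data; other unlearning
# techniques could be substituted here.
def unlearn(model, inputs, labels, flagged, steps=10, lr=1e-3):
    """Maximize the loss on flagged data so the model 'forgets' it."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(steps):
        opt.zero_grad()
        # Negating the loss turns gradient descent into ascent.
        loss = -F.cross_entropy(model(inputs[flagged]), labels[flagged])
        loss.backward()
        opt.step()
    return model

# Pipeline: detect, then unlearn, with no retraining from scratch.
# flagged = detect_mislabeled(model, train_x, train_y)
# model = unlearn(model, train_x, train_y, flagged)
```

Note how the sketch mirrors the abstract's failure mode: if `mislabel_fraction` is high, Stage 1 flags many clean examples along with the mislabeled ones, so Stage 2 also unlearns useful data.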