Host: The Japanese Society for Artificial Intelligence
Name: The 39th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 39
Location: [in Japanese]
Date: May 27, 2025 - May 30, 2025
“Epoch-wise double descent” refers to the phenomenon in which, when training with label noise, the test loss decreases again after the model has overfit. The traditional bias-variance trade-off cannot explain this behavior. In this study, we analyze learning curves separated into clean-label and noisy-label data to better understand the phenomenon. We conducted numerical experiments with a 7-layer MLP on the CIFAR-10 dataset with 30% label noise. The training process is visualized by decomposing the training loss into three components: the loss on clean-label data, the loss on noisy-label data, and the loss on noisy-label data evaluated with the original (uncorrupted) labels. Our results reveal that training proceeds in three phases before double descent occurs: (1) the model learns only the clean-label data; (2) it begins to fit the noisy-label data, causing the test loss to increase; and (3) it fits the noisy labels perfectly, after which the test loss decreases, producing the double descent. These findings suggest that epoch-wise double descent arises when the model overfits the noisy-label data, which in turn restores the generalization of its predictions.
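To make the loss decomposition concrete, here is a minimal PyTorch-style sketch (not the authors' code) of how symmetric label noise can be injected and how the training loss can be split into the three components described above. The helper names, the 30% noise rate, and the `model`/batch interface are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def inject_label_noise(labels: torch.Tensor, noise_rate: float = 0.3,
                       num_classes: int = 10, seed: int = 0):
    """Flip a fraction of labels to a uniformly random *different* class.

    Returns the corrupted labels and a boolean mask marking flipped samples.
    (Hypothetical helper; the paper's exact noise-injection scheme may differ.)
    """
    g = torch.Generator().manual_seed(seed)
    n = labels.numel()
    noisy_mask = torch.rand(n, generator=g) < noise_rate
    # A random offset in 1..num_classes-1 guarantees the new label differs.
    offset = torch.randint(1, num_classes, (n,), generator=g)
    corrupted = labels.clone()
    corrupted[noisy_mask] = (labels[noisy_mask] + offset[noisy_mask]) % num_classes
    return corrupted, noisy_mask


@torch.no_grad()
def decomposed_losses(model, inputs, corrupted, original, noisy_mask):
    """Evaluate the three training-loss components on one batch.

    Assumes the batch contains both clean and noisy samples.
    """
    logits = model(inputs)
    clean = ~noisy_mask
    return {
        # (1) clean-label data, evaluated with their (unchanged) labels
        "clean": F.cross_entropy(logits[clean], corrupted[clean]).item(),
        # (2) noisy-label data, evaluated with the corrupted labels used in training
        "noisy": F.cross_entropy(logits[noisy_mask], corrupted[noisy_mask]).item(),
        # (3) the same noisy-label data, evaluated against their original labels
        "noisy_vs_original": F.cross_entropy(
            logits[noisy_mask], original[noisy_mask]).item(),
    }
```

Logging these three curves per epoch would reproduce the kind of phase-separated visualization the abstract describes: component (1) drops first, component (2) falls while the test loss rises, and the test loss begins its second descent only once component (2) reaches (near-)zero.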