2025, Vol. 91, No. 4, pp. 510-517
As part of efforts to address the issue of fairness in computer vision, image datasets are being reviewed, and inappropriate datasets have been temporarily suspended or withdrawn from public availability. What impact do models pre-trained on large-scale image datasets have on downstream tasks from the perspective of fairness? In this paper, we quantitatively evaluate how pre-training methods, namely manually supervised learning (MSL) and self-supervised learning (SSL) on ImageNet, affect the fairness of downstream tasks. Our experiments yield three findings: (i) SimCLRv2 and MoCov2 pre-trained models perform more fairly than ImageNet MSL pre-trained models on IMDB-WIKI, FairFace, and CIFAR-10S. (ii) The MoCov2 pre-trained model achieves better accuracy and fairness metrics than the SimCLRv2 pre-trained model. (iii) In MoCov2, color jitter tends to improve the fairness metrics of downstream tasks. These findings demonstrate that SSL pre-trained models have the potential for fairer image recognition at accuracy comparable to MSL pre-trained models. By replacing human annotations with self-supervision, we can construct fair pre-trained models that do not depend solely on human-annotated labels.
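The abstract does not specify which fairness metrics are used; a common choice when evaluating fairness across demographic attributes (as on IMDB-WIKI, FairFace, and CIFAR-10S) is the per-group accuracy gap. The following is a minimal illustrative sketch of such a metric; the function name and the max-min gap definition are our assumptions, not necessarily the paper's exact protocol.

```python
import numpy as np

def group_accuracy_gap(y_true, y_pred, groups):
    """Per-group accuracy and the max-min gap across groups.

    A smaller gap indicates fairer predictions; the paper's exact
    fairness metric may differ from this illustrative choice.
    """
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    accs = {}
    for g in np.unique(groups):
        mask = groups == g
        accs[g] = float((y_pred[mask] == y_true[mask]).mean())
    gap = max(accs.values()) - min(accs.values())
    return accs, gap

# Toy example with a binary demographic attribute ("f" / "m")
y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]
groups = ["f", "f", "f", "m", "m", "m"]
per_group, gap = group_accuracy_gap(y_true, y_pred, groups)
print(per_group, gap)  # both groups at 2/3 accuracy, gap 0.0
```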
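Finding (iii) concerns the color jitter step in MoCov2's pre-training augmentations. For reference, a MoCo v2-style augmentation stack in torchvision looks roughly like the sketch below; the parameter values follow the public MoCo v2 reference implementation and are an assumption here, since the abstract does not state the paper's exact settings.

```python
from torchvision import transforms

# MoCo v2-style augmentation pipeline (values from the public MoCo v2
# reference code; the paper's settings may differ).
moco_v2_aug = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.2, 1.0)),
    # Finding (iii): this stochastic color jitter step is the one the
    # abstract reports as tending to improve downstream fairness metrics.
    transforms.RandomApply(
        [transforms.ColorJitter(brightness=0.4, contrast=0.4,
                                saturation=0.4, hue=0.1)],
        p=0.8,
    ),
    transforms.RandomGrayscale(p=0.2),
    # Kernel size ~10% of the 224-px crop, rounded to an odd integer
    transforms.RandomApply(
        [transforms.GaussianBlur(23, sigma=(0.1, 2.0))], p=0.5
    ),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```

One plausible intuition, consistent with the finding, is that randomizing brightness, contrast, saturation, and hue discourages the encoder from relying on color cues that correlate with demographic attributes such as skin tone.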