Image recognition models are known to sometimes make predictions based on incorrect or unintended factors. To address this, we explore whether models can be evaluated by comparing the validity of their prediction rationales. Specifically, we propose a method that uses LIME to visualize the regions contributing to a prediction, calculates a score based on the consistency of these regions across models, and ranks the models by their total score, with higher-scoring models considered more reliable. As a preliminary step, we conducted an experiment on the ImageNet evaluation dataset to verify whether models can be assessed by comparing their prediction rationales through visual inspection. The results revealed clear differences between models, confirming that such an evaluation is feasible. We then applied the proposed method to 12 pre-trained models on the same dataset. The evaluation results show that DenseNet-121 and ResNet-50 achieved the highest scores, suggesting that the proposed method has a certain degree of utility as a simplified alternative to manual (visual) evaluation. In addition, some models, such as ConvNeXt, received lower scores despite achieving the highest classification accuracy, because the regions contributing to their predictions differed from those of the other models. Even though such models make correct decisions based on valid reasoning, the proposed method was unable to distinguish them from genuinely poor models.
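The sketch below illustrates the general idea of cross-model consistency scoring with LIME, under assumptions not stated in the abstract: the contributing region for each model is taken as the binary mask of its top LIME superpixels, and each model's score is the sum of IoU overlaps between its mask and the other models' masks. The helper names (`lime_mask`, `consistency_scores`) and the IoU-based metric are illustrative choices, not the paper's exact formulation.

```python
# Minimal sketch of LIME-based region-consistency scoring (assumed IoU metric).
import numpy as np
from lime import lime_image

def lime_mask(predict_fn, image, num_features=5, num_samples=1000):
    """Return a boolean mask of the top LIME superpixels for the model's predicted class.

    predict_fn: callable mapping a batch of images (N, H, W, 3) to class probabilities.
    """
    explainer = lime_image.LimeImageExplainer()
    explanation = explainer.explain_instance(
        image, predict_fn, top_labels=1, num_samples=num_samples)
    label = explanation.top_labels[0]
    _, mask = explanation.get_image_and_mask(
        label, positive_only=True, num_features=num_features, hide_rest=False)
    return mask.astype(bool)

def consistency_scores(masks):
    """Score each model by summing the IoU of its mask with every other model's mask."""
    n = len(masks)
    scores = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            inter = np.logical_and(masks[i], masks[j]).sum()
            union = np.logical_or(masks[i], masks[j]).sum()
            scores[i] += inter / union if union else 0.0
    return scores

# Usage (hypothetical): collect one mask per model for an image, then rank models
# by the total score accumulated over the evaluation set.
# masks = [lime_mask(fn, image) for fn in model_predict_fns]
# scores = consistency_scores(masks)
```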