Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
37th (2023)
Session ID : 2A1-GS-2-03

Self-Examination Mechanism: Lightweight Defense Mechanism against Adversarial Examples using Explainable AI
*Sora SUEGAMI, Yutaro OGURI, Zaiying ZHAO, Yu KAGAYA, Koki MUKAI, Shun YOSHIDA, Fu CHEN, Toshihiko YAMASAKI
Abstract

Deep learning-based image classification models are vulnerable to adversarial examples (AEs). Existing defense methods improve classification accuracy on AEs, but they reduce accuracy on clean, unperturbed images. To solve this problem, we propose a new defense mechanism called the self-examination mechanism. In the proposed method, the input image is first classified. The inference process of the classification model is then verified using SHapley Additive exPlanations (SHAP), an explainable-AI method. If the input is judged abnormal, classification is performed again based on the SHAP output. In this way, misclassification of AEs can be prevented without significantly reducing classification accuracy on clean images. Evaluations on ResNet and WideResNet trained on CIFAR10 demonstrate that our method improves accuracy on AEs while hardly reducing accuracy on clean images.
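The abstract describes a three-step pipeline: classify, verify the inference with an attribution method, and re-classify when the input looks abnormal. The following is a minimal sketch of that control flow, not the paper's implementation: it uses a toy linear classifier, occlusion-based attributions as a cheap stand-in for SHAP, and an assumed attribution-concentration heuristic for the abnormality check.

```python
import numpy as np

# Hypothetical sketch of a self-examination pipeline.
# Stand-ins (not from the paper): a linear toy classifier, occlusion
# attributions instead of SHAP, and a concentration-threshold anomaly test.

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 16))  # toy classifier: 16 features, 10 classes

def classify(x):
    return int(np.argmax(W @ x))

def attributions(x):
    # Occlusion attribution: score drop for the predicted class when
    # each feature is zeroed out (a crude proxy for SHAP values).
    label = classify(x)
    base = (W @ x)[label]
    attr = np.empty_like(x)
    for i in range(x.size):
        x_occ = x.copy()
        x_occ[i] = 0.0
        attr[i] = base - (W @ x_occ)[label]
    return attr

def self_examine(x, tau=0.5):
    """Step 1: classify. Step 2: verify via attributions.
    Step 3: re-classify from the attribution output if abnormal."""
    label = classify(x)
    attr = attributions(x)
    # Assumed heuristic: flag the input as abnormal when attribution
    # mass is spread thinly rather than concentrated on a few features.
    concentration = np.abs(attr).max() / (np.abs(attr).sum() + 1e-12)
    if concentration < tau:
        # Re-classify using only the most strongly attributed features.
        keep = np.abs(attr) >= np.median(np.abs(attr))
        label = classify(x * keep)
    return label

x = rng.normal(size=16)
print(self_examine(x))
```

A real implementation would replace the occlusion step with SHAP values from the `shap` library and run the check against a trained ResNet/WideResNet; the sketch only illustrates how the re-classification branch gates on the explanation.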

© 2023 The Japanese Society for Artificial Intelligence