Journal of Signal Processing
Online ISSN : 1880-1013
Print ISSN : 1342-6230
ISSN-L : 1342-6230
Countermeasure against Backdoor Attack on Neural Networks Utilizing Knowledge Distillation
Kota Yoshida, Takeshi Fujino
Free access

2020, Volume 24, Issue 4, pp. 141-144

Abstract

A backdoor attack is a well-known security issue facing deep neural networks (DNNs). In a backdoor attack against DNNs for image classification, an adversary creates tampered data containing special marks ("poison data") and injects them into the training dataset. A DNN model trained on the tampered dataset achieves high classification accuracy on clean (normal) input data, but inference on poisoned input data is misclassified as the adversary's target label. In this work, we propose a countermeasure against the backdoor attack that utilizes knowledge distillation, in which the DNN model user distills the backdoored DNN model with clean unlabeled data. The distilled DNN model inherits only the clean knowledge of the backdoored model because the backdoor is not activated by clean data. Experimental results showed that the distilled model achieves performance equivalent to that of a clean model trained without a backdoor.
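The distillation step described in the abstract can be sketched as follows. This is a minimal illustration of standard knowledge distillation in a PyTorch setting, not the authors' exact implementation: the names `teacher` (the possibly backdoored model), `student` (a freshly initialized model), `unlabeled_loader` (batches of clean unlabeled images), and the temperature `T` are all assumptions introduced here for illustration.

```python
import torch
import torch.nn.functional as F

def distill_backdoored_model(teacher, student, unlabeled_loader,
                             epochs=10, T=4.0, lr=1e-3, device="cpu"):
    """Train `student` to mimic `teacher` on clean unlabeled data.

    Because the backdoor trigger never appears in the clean data,
    the teacher's outputs convey only its benign behavior, so the
    distilled student should not inherit the backdoor.
    (Hypothetical sketch; not the paper's reference code.)
    """
    teacher.to(device).eval()
    student.to(device).train()
    optimizer = torch.optim.Adam(student.parameters(), lr=lr)

    for _ in range(epochs):
        for x in unlabeled_loader:  # unlabeled: batches of inputs only
            x = x.to(device)
            with torch.no_grad():
                # Softened teacher predictions serve as pseudo-labels.
                soft_targets = F.softmax(teacher(x) / T, dim=1)
            log_probs = F.log_softmax(student(x) / T, dim=1)
            # KL divergence between the softened distributions, the
            # standard distillation loss, scaled by T^2 as is common.
            loss = F.kl_div(log_probs, soft_targets,
                            reduction="batchmean") * (T * T)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student
```

The key design point is that no labels and no training data from the original (tampered) dataset are needed: the student only ever sees clean inputs and the teacher's soft outputs on them, which is why the trigger-dependent behavior is never transferred.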

© 2020 Research Institute of Signal Processing, Japan