Host: The Japanese Society for Artificial Intelligence
Name: The 38th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 38
Location: [in Japanese]
Date: May 28, 2024 - May 31, 2024
Deep reinforcement learning (DRL) is known to be vulnerable to adversarial attacks, so improving the robustness of DRL agents is necessary for real-world applications. To investigate this vulnerability, we propose a targeted manipulation attack that specifies the behavior of the victim agent, assuming a realistic attack setting. As the threat model, we consider a situation in which the attacker can add perturbations to the victim agent's observations. The attacker's goal is to manipulate the victim agent: the attacker expresses the desired behavior as a trajectory and attacks the victim agent so that it imitates this trajectory. We realize the attack using imitation learning. Finally, experiments on MetaWorld, a benchmark for reinforcement learning, confirm that the targeted manipulation attack succeeds under this threat model.
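The threat model above — an attacker who perturbs the victim's observations so that its actions move toward an attacker-specified target trajectory — can be sketched minimally. The abstract does not give implementation details, so the code below is an illustrative assumption: the victim is a toy linear policy, the attacker knows its weights, and the imitation-learning attack is replaced by a projected-gradient perturbation that pushes the victim's action toward one action from the attacker's desired trajectory.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4))  # hypothetical victim policy weights (white-box assumption)

def policy(obs):
    # Toy differentiable victim policy: action = W @ obs.
    return W @ obs

def attack(obs, target_action, eps=0.5, lr=0.01, steps=50):
    """Find a bounded observation perturbation delta (|delta_i| <= eps)
    that drives the victim's action toward the attacker's target action,
    via projected gradient descent on the squared action error."""
    delta = np.zeros_like(obs)
    for _ in range(steps):
        # Gradient of ||W @ (obs + delta) - target_action||^2 w.r.t. delta.
        grad = 2.0 * W.T @ (policy(obs + delta) - target_action)
        delta -= lr * grad
        delta = np.clip(delta, -eps, eps)  # keep the perturbation small
    return obs + delta

obs = rng.normal(size=4)
target = np.array([1.0, -1.0])  # one action from the attacker's desired trajectory

before = np.sum((policy(obs) - target) ** 2)
after = np.sum((policy(attack(obs, target)) - target) ** 2)
print(before, after)  # the perturbed observation yields an action closer to the target
```

In the paper's setting the victim is a DRL agent and the perturbation is produced for a full trajectory via imitation learning; this sketch only shows the core mechanism of steering actions through observation perturbations.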