Host: The Japan Society of Mechanical Engineers
Name : [in Japanese]
Date : May 27, 2020 - May 30, 2020
Behavioral cloning, an imitation learning method, enables a robot to imitate an expert’s policy from demonstrations of the expert’s states and actions. Because the robot does not need to interact with the environment, robot failures are avoided. In general, however, expert action information is difficult to obtain. Behavioral cloning from observation allows the robot to learn the policy without expert actions, but it requires a few interactions with the environment to infer them, which leaves a risk of robot failure. Detecting whether an encountered situation is safe or dangerous is an effective way to prevent such dangerous interactions. Assuming that the expert’s demonstrations visit only safe states, this paper proposes a new outlier detector based on a variational autoencoder trained on the expert’s data. It can easily identify unexperienced and potentially dangerous scenes because all the data used for training are mapped to a limited region of the latent space. In simulations, the proposed method improved policy performance while limiting the number of robot failures.
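The outlier-detection idea can be illustrated with a minimal sketch. The abstract does not state which score or threshold the authors use, so everything below is an assumption: the score is the closed-form KL divergence between an encoder's Gaussian posterior and a standard normal prior, the threshold is a quantile of the scores on expert states, and `toy_encoder` is a hypothetical stand-in for a trained variational autoencoder encoder.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def toy_encoder(state: Vector) -> Tuple[List[float], List[float]]:
    """Hypothetical stand-in for a trained VAE encoder (assumption).

    Returns the posterior mean and log-variance for a state. A real
    encoder would be a neural network trained on expert states only.
    """
    mu = [float(s) for s in state]
    log_var = [-1.0 for _ in state]
    return mu, log_var


def kl_to_standard_normal(mu: Vector, log_var: Vector) -> float:
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def outlier_score(state: Vector) -> float:
    """Score a state by how far its posterior lies from the prior."""
    mu, log_var = toy_encoder(state)
    return kl_to_standard_normal(mu, log_var)


def fit_threshold(scores: Sequence[float], quantile: float = 0.95) -> float:
    """Pick the score below which the given fraction of expert data falls."""
    ordered = sorted(scores)
    index = int(math.ceil(quantile * len(ordered))) - 1
    index = max(0, min(len(ordered) - 1, index))
    return ordered[index]


def is_outlier(score: float, threshold: float) -> bool:
    """Flag a state as unexperienced (possibly dangerous) if it scores high."""
    return score > threshold


# Illustrative expert states (made up for this sketch, not from the paper).
expert_states = [(0.1 * i, -0.05 * i) for i in range(-10, 11)]
threshold = fit_threshold([outlier_score(s) for s in expert_states])

near_expert = (0.2, -0.1)   # close to the demonstrated states
far_from_expert = (5.0, 5.0)  # far outside the demonstrated region

print(is_outlier(outlier_score(near_expert), threshold))
print(is_outlier(outlier_score(far_from_expert), threshold))
```

In a setting like the one the abstract describes, such a check would presumably run before the robot executes an exploratory action, so that states flagged as outliers can be avoided. How the detector is actually integrated with behavioral cloning from observation is not specified in the abstract.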