Imitation Learning Enabling Safe Exploration Using a VAE-Based Anomaly Detector

藤石 秀仁; 小林 泰介; 杉本 謙二

doi:10.1299/jsmermd.2020.2A2-J07

Abstract

Behavioral cloning, one of the imitation learning methods, enables a robot to imitate an expert’s policy from demonstrations of the expert’s states and actions. Because the robot does not need to interact with the environment, robot failures are avoided. In general, however, the expert’s action information is difficult to obtain. Behavioral cloning from observation allows the robot to learn the policy without it, but requires a few interactions with the environment to infer the expert’s actions, which leaves a risk of robot failure. Detecting whether the situation the robot faces is safe or dangerous is an effective way to prevent such dangerous interactions. Assuming that the expert’s demonstrations visit only safe states, this paper proposes a new outlier detector using a variational autoencoder (VAE) trained on the expert’s data. Because all the data used for training are mapped to a limited region of the latent space, the detector can easily identify unexperienced and dangerous scenes. In simulations, the proposed method improved policy performance while keeping the number of robot failures limited.
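As a rough illustration of the detection idea, the sketch below scores a state by how far its encoding lies from the region occupied by expert data, and flags states whose score exceeds a threshold calibrated on the expert demonstrations. The abstract does not specify the score or the threshold rule, so several choices here are assumptions: the score is the KL divergence between the encoder's Gaussian posterior and a standard normal prior, the threshold is a 0.99 quantile of expert scores, and `make_toy_encoder` is a hypothetical stand-in for a trained VAE encoder (it only standardizes the input). The names `LatentOutlierDetector`, `calibrate`, and `is_safe` are likewise illustrative and do not come from the paper.

```python
import numpy as np


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), one value per row."""
    return 0.5 * np.sum(mu**2 + np.exp(logvar) - 1.0 - logvar, axis=-1)


class LatentOutlierDetector:
    """Flags states whose latent score exceeds a threshold set on expert data.

    `encoder` is any callable mapping states of shape (n, d) to a pair
    (mu, logvar); in practice it would be a VAE encoder trained on expert states.
    """

    def __init__(self, encoder, quantile=0.99):
        self.encoder = encoder
        self.quantile = quantile  # assumed calibration rule, not from the paper
        self.threshold = None

    def score(self, states):
        mu, logvar = self.encoder(np.atleast_2d(states))
        return kl_to_standard_normal(mu, logvar)

    def calibrate(self, expert_states):
        # Expert demonstrations are assumed to contain only safe states.
        scores = self.score(expert_states)
        self.threshold = float(np.quantile(scores, self.quantile))
        return self.threshold

    def is_safe(self, states):
        if self.threshold is None:
            raise RuntimeError("call calibrate() with expert states first")
        return self.score(states) <= self.threshold


def make_toy_encoder(expert_states):
    """Hypothetical stand-in for a trained VAE encoder.

    It standardizes inputs with expert statistics and returns a fixed
    log-variance, so expert data lands near the origin of the latent space.
    """
    mean = expert_states.mean(axis=0)
    std = expert_states.std(axis=0) + 1e-8

    def encoder(x):
        mu = (x - mean) / std
        logvar = np.full_like(mu, -2.0)
        return mu, logvar

    return encoder


# Synthetic "expert" states, used only to exercise the detector.
rng = np.random.default_rng(0)
expert_states = rng.normal(0.0, 1.0, size=(1000, 4))

detector = LatentOutlierDetector(make_toy_encoder(expert_states))
detector.calibrate(expert_states)

near = np.zeros(4)        # close to the expert data
far = np.full(4, 10.0)    # far from anything the expert visited
print(detector.is_safe(near)[0], detector.is_safe(far)[0])
```

During the exploratory interactions that behavioral cloning from observation needs, such a check could gate each step, for example by stopping or resetting the robot once the current state is flagged as unsafe. How the flag is actually used in the proposed method is not stated in the abstract.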
