Learning Alignment Tasks Based on Residual Reinforcement Learning with Multiple Expert Policies

Kazuki Yaginuma; Tomoaki Nakamura; Yusuke Kato; Takayuki Nagai; Jun Ozawa

doi:10.11517/pjsai.JSAI2020.0_1Q4GS1102

34th (2020)

Session ID : 1Q4-GS-11-02

DOI https://doi.org/10.11517/pjsai.JSAI2020.0_1Q4GS1102

Conference information

Host: The Japanese Society for Artificial Intelligence

Name : 34th Annual Conference, 2020

Number : 34

Location : Online

Date : June 09, 2020 - June 12, 2020

Learning Alignment Tasks Based on Residual Reinforcement Learning with Multiple Expert Policies

*Kazuki YAGINUMA, Tomoaki NAKAMURA, Yusuke KATO, Takayuki NAGAI, Jun OZAWA

Keywords: Residual Reinforcement Learning, Reinforcement Learning, Gaussian process-hidden semi-Markov model (GP-HSMM)

CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Reinforcement learning (RL) enables robots to learn skills flexibly through interaction with the environment. However, in the early stages of learning, robots move randomly to explore for valuable actions, which can be unsafe. Moreover, learning a motion from scratch takes a long time. Residual reinforcement learning (RRL) has been proposed to make learning more efficient and safer. In RRL, a skill is learned by correcting an expert policy, which can be obtained from an expert's demonstration. However, conventional RRL assumes a single expert policy, whereas we consider multiple expert policies to handle more complex tasks. In this paper, we propose RRL with multiple expert policies, in which the selection of a suitable expert policy for the current state is also learned by RL. Experiments show that the agent learns more accurate skills in an object alignment task.
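
The abstract's core idea (choose one of several expert policies for the current state, then add a learned correction to that expert's action) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The one-dimensional state, the two hand-written experts, the epsilon-greedy selector, and every function name here are hypothetical stand-ins chosen for illustration. In the paper both the selector and the residual are learned by RL, whereas this sketch uses fixed placeholder functions.

```python
import random

# Hypothetical expert policies for a 1-D toy alignment problem.
# Each maps a state (object offset from the target) to a base action.
def expert_move_left(state):
    return -0.5

def expert_move_right(state):
    return 0.5

EXPERTS = [expert_move_left, expert_move_right]


def select_expert(state, selector_scores, epsilon=0.0, rng=random):
    """Epsilon-greedy choice of an expert index (an assumed selection rule)."""
    if rng.random() < epsilon:
        return rng.randrange(len(EXPERTS))  # explore: pick a random expert
    scores = selector_scores(state)
    return max(range(len(scores)), key=lambda k: scores[k])  # exploit


def residual_action(state, selector_scores, residual_policy, epsilon=0.0):
    """Final action = selected expert's action + residual correction."""
    k = select_expert(state, selector_scores, epsilon)
    base = EXPERTS[k](state)
    correction = residual_policy(state, k)
    return k, base + correction


# Placeholder "learned" components (assumptions, not trained models).
def toy_scores(state):
    # Prefer moving left when the object is right of the target, and vice versa.
    return [state, -state]

def toy_residual(state, expert_index):
    # Fixed correction per expert that shortens the expert's step.
    return {0: 0.125, 1: -0.125}[expert_index]


if __name__ == "__main__":
    print(residual_action(0.375, toy_scores, toy_residual))
    print(residual_action(-0.375, toy_scores, toy_residual))
```

In this sketch the expert supplies a coarse but safe motion and the residual only adjusts it, which reflects the motivation the abstract gives for RRL over learning from scratch.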
