Host: The Japanese Society for Artificial Intelligence
Name : 34th Annual Conference, 2020
Number : 34
Location : Online
Date : June 09, 2020 - June 12, 2020
Reinforcement learning (RL) enables robots to flexibly learn skills through interaction with the environment. However, in the early stages of learning, robots move randomly to explore valuable actions, which can be unsafe. Moreover, learning motions from scratch takes a long time. To learn more efficiently and safely, residual reinforcement learning (RRL) has been proposed. In RRL, skills are learned by correcting an expert policy, which can be obtained from an expert's demonstration. However, conventional RRL assumes a single expert policy, whereas we consider multiple expert policies to handle more complex tasks. In this paper, we propose RRL with multiple expert policies, in which the selection of a suitable expert policy for the current state is also learned by RL. Experimentally, we show that the agent can learn more accurate skills in an object alignment task.
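
As an illustration only, the action composition described in the abstract might be sketched as follows: a selector scores each expert policy in the current state, one expert is chosen, and a learned residual corrects that expert's action. The abstract does not give the selection rule, the network structure, or the training procedure, so everything here is an assumption: the names `select_expert` and `rrl_action`, the epsilon-greedy choice, and the toy experts, selector, and residual are all hypothetical stand-ins, not the authors' method.

```python
import random


def select_expert(scores, rng, epsilon=0.0):
    """Pick an expert index from per-expert scores.

    Assumption: an epsilon-greedy rule. The abstract only says the
    selection is learned by RL, not how it is carried out.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(scores))
    return max(range(len(scores)), key=lambda i: scores[i])


def rrl_action(state, experts, selector, residual, rng, epsilon=0.0):
    """Return (chosen expert index, expert action plus residual correction)."""
    k = select_expert(selector(state), rng, epsilon)
    base = experts[k](state)        # action proposed by the chosen expert
    correction = residual(state)    # correction that would be learned by RL
    return k, [b + c for b, c in zip(base, correction)]


# Toy stand-ins (hypothetical): two hand-written experts, a fixed selector,
# and a constant residual in place of learned models.
experts = [
    lambda s: [-0.5 * x for x in s],  # expert 0: move back toward the origin
    lambda s: [1.0 for _ in s],       # expert 1: constant push
]
selector = lambda s: [-abs(s[0]), abs(s[0]) - 1.0]
residual = lambda s: [0.1 for _ in s]

rng = random.Random(0)
k_near, a_near = rrl_action([0.2, 0.0], experts, selector, residual, rng)
k_far, a_far = rrl_action([2.0, 0.0], experts, selector, residual, rng)
```

In this toy setup, expert 0 is chosen near the origin and expert 1 farther away. In an actual RRL system, both the selector and the residual would be trained from reward rather than fixed by hand.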