Host: The Japan Society of Mechanical Engineers
Name : [in Japanese]
Date : June 05, 2019 - June 08, 2019
Learning robot motion with reinforcement learning (RL) has attracted attention, and improvements to various RL methods have been pursued intensively. With conventional RL methods, however, a complicated task requires a long learning process, which is problematic in the robotics domain. In this paper, we focused on the compositionality of policies in Soft Q-learning (SoftQL). With SoftQL, multiple already-learned policies can be composed to execute compound tasks efficiently. In SoftQL, however, the action-sampling procedure and the learning algorithm are complex because the action space is continuous. We therefore applied SoftQL to a maze-solving problem, which has a discrete space, and investigated its performance and computational tractability for discrete-space problems.