Application of Soft Q-learning to a Maze-Solving Problem and Verification of Policy Compositionality

松岡 潤樹; 鶴峯 義久; 松原 崇充

doi:10.1299/jsmermd.2019.1P2-A09

Abstract

Learning robot motion with reinforcement learning (RL) has attracted attention, and various RL methods have been improved intensively. With conventional RL methods, however, a complicated task requires a long learning process, which is problematic in the robotics domain. In this paper, we focus on the compositionality of policies in Soft Q-learning (SoftQL). With SoftQL, multiple already-learned policies can be composed to execute compound tasks efficiently. However, the action-sampling procedure and learning algorithm of SoftQL are complex because they must handle a continuous action space. We therefore apply SoftQL to a maze-solving problem with a discrete space and investigate its performance and computational tractability on discrete-space problems.
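To make the discrete-space setting concrete, the following is a minimal tabular sketch, not the authors' implementation. It assumes the standard maximum-entropy formulation, in which the soft value is a log-sum-exp over actions and the policy is the corresponding Boltzmann distribution, and it assumes that composition is approximated by averaging already-learned Q-tables. The function names, the temperature `alpha`, the discount `gamma`, and the learning rate `lr` are all hypothetical choices made for illustration.

```python
import numpy as np


def soft_value(q, alpha=1.0):
    """Soft value V(s) = alpha * log(sum_a exp(Q(s, a) / alpha)), computed stably."""
    z = q / alpha
    m = z.max(axis=-1, keepdims=True)  # subtract the max to avoid overflow in exp
    return alpha * (m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1)))


def soft_policy(q, alpha=1.0):
    """Boltzmann policy pi(a | s) = exp((Q(s, a) - V(s)) / alpha)."""
    return np.exp((q - soft_value(q, alpha)[..., None]) / alpha)


def soft_q_update(q, s, a, r, s_next, done, alpha=1.0, gamma=0.95, lr=0.1):
    """One tabular soft Q-learning step; modifies the table q in place."""
    # The hard max of ordinary Q-learning is replaced by the soft value.
    target = r if done else r + gamma * soft_value(q[s_next], alpha)
    q[s, a] += lr * (target - q[s, a])


def compose(q_tables):
    """Assumed composition rule: average the already-learned Q-tables."""
    return np.mean(q_tables, axis=0)


if __name__ == "__main__":
    # Two hypothetical single-state Q-tables with two actions each.
    q_task1 = np.array([[1.0, 0.0]])
    q_task2 = np.array([[0.0, 3.0]])
    q_both = compose([q_task1, q_task2])
    print(soft_policy(q_both, alpha=0.5))
```

With a finite action set the sum over actions is exact, so sampling an action reduces to drawing from a categorical distribution; no approximate sampler is needed in this sketch.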
