Host: The Japan Society of Mechanical Engineers
Name : [in Japanese]
Date : June 28, 2023 - July 01, 2023
Reinforcement learning is often difficult to apply in practice because of its vast search space. To address this issue, Zhang et al. proposed Value Disagreement Sampling (VDS), which sets pseudo-goals according to the degree of disagreement among multiple value functions. However, VDS may not select pseudo-goals that contribute to learning the task objective. In this paper, we aim to improve learning efficiency by sampling pseudo-goals based on the state-action distribution sampled from the current policy. Simulation results demonstrate that the proposed approach improves learning efficiency, especially in the later stages of learning.
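The disagreement-based goal selection that the abstract attributes to VDS can be illustrated with a minimal sketch. It assumes that disagreement is measured as the standard deviation of an ensemble's value estimates and that pseudo-goals are drawn with probability proportional to that score; the abstract does not state either detail. The names `disagreement_scores`, `sample_pseudo_goal`, `toy_goals`, and `toy_value_fns` are hypothetical, and the toy value functions stand in for learned ones. The sketch does not cover the method proposed in this paper, whose details are not given here.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple

Goal = Tuple[float, ...]
ValueFn = Callable[[Goal], float]


def disagreement_scores(goals: Sequence[Goal], value_fns: Sequence[ValueFn]) -> List[float]:
    """Return, for each goal, the population std of the ensemble's value estimates.

    Assumption: standard deviation is used as the disagreement measure.
    """
    return [statistics.pstdev([v(g) for v in value_fns]) for g in goals]


def sample_pseudo_goal(goals: Sequence[Goal], value_fns: Sequence[ValueFn],
                       rng: random.Random) -> Goal:
    """Draw one pseudo-goal with probability proportional to its disagreement score."""
    scores = disagreement_scores(goals, value_fns)
    if sum(scores) <= 0.0:
        # The ensemble agrees everywhere: fall back to uniform sampling.
        return rng.choice(list(goals))
    return rng.choices(list(goals), weights=scores, k=1)[0]


# Toy stand-ins (hypothetical): three linear "value functions" that agree at
# the origin and disagree more the farther the goal is from it.
toy_value_fns: List[ValueFn] = [
    lambda g: 0.5 * g[0],
    lambda g: 1.0 * g[0],
    lambda g: 1.5 * g[0],
]
toy_goals: List[Goal] = [(0.0,), (1.0,), (2.0,)]

pseudo_goal = sample_pseudo_goal(toy_goals, toy_value_fns, random.Random(0))
```

In this toy setup the goal at the origin has zero disagreement and is never drawn, while the farthest goal is drawn most often. Goals on which the ensemble already agrees are thereby skipped.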