Organizer: The Japan Society of Mechanical Engineers
Conference: The Robotics and Mechatronics Conference 2023
Dates: 2023/06/28 - 2023/07/01
Reinforcement learning is often difficult to apply because the search space is vast. To address this issue, Zhang et al. proposed Value Disagreement Sampling (VDS), which sets pseudo-goals according to the degree of disagreement among multiple value functions. However, VDS may select pseudo-goals that do not contribute to learning the task objective. In this paper, we aim to improve learning efficiency by sampling pseudo-goals based on the state-action distribution obtained from the current policy. Simulation results show that the proposed approach improves learning efficiency, especially in the later stages of learning.
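The disagreement-based goal selection attributed to VDS above can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions, not the implementation of Zhang et al. or of this paper: it assumes that disagreement is measured as the standard deviation of the value estimates across the ensemble, and that pseudo-goals are drawn with probability proportional to that disagreement. The function names `disagreement_sampling_probs` and `sample_pseudo_goal` are hypothetical, and the proposed policy-based sampling is not shown because the abstract does not specify it.

```python
import numpy as np


def disagreement_sampling_probs(value_estimates):
    """Turn ensemble value estimates into goal-sampling probabilities.

    value_estimates has shape (n_value_functions, n_candidate_goals).
    Assumption: disagreement is the standard deviation across the ensemble.
    """
    disagreement = np.std(value_estimates, axis=0)
    total = disagreement.sum()
    if total <= 0.0:
        # All value functions agree everywhere: fall back to uniform sampling.
        return np.full(disagreement.shape, 1.0 / disagreement.size)
    return disagreement / total


def sample_pseudo_goal(candidate_goals, value_estimates, rng):
    """Draw one pseudo-goal with probability proportional to disagreement."""
    probs = disagreement_sampling_probs(value_estimates)
    index = rng.choice(len(candidate_goals), p=probs)
    return candidate_goals[index], probs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical data: 5 candidate goals in 2-D, an ensemble of 3 value functions.
    goals = rng.uniform(-1.0, 1.0, size=(5, 2))
    values = rng.normal(size=(3, 5))
    goal, probs = sample_pseudo_goal(goals, values, rng)
    print("sampling probabilities:", probs)
    print("selected pseudo-goal:", goal)
```

Under these assumptions, goals on which the value functions agree receive a low sampling probability, while goals with high disagreement are selected more often.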