Host: The Japan Society of Mechanical Engineers
Name : [in Japanese]
Date : June 02, 2018 - June 05, 2018
In recent years, reinforcement learning combined with deep learning has developed rapidly, achieving strong performance not only in game playing but also in the continuous control of robots. Reinforcement learning requires exploratory behavior, and action noise is widely used to realize it. Recent research has tackled the exploration problem in deep reinforcement learning by using parameter noise, which has been experimentally shown to explore better than the commonly used action noise. However, existing methods either need a long time to update the noise distribution or explore uniformly in a huge parameter space by using an isotropic noise distribution. This paper proposes a method that improves the update of the noise distribution for faster learning.
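The contrast the abstract draws between action noise and parameter noise can be sketched as follows. This is a toy illustration with a linear policy, not the paper's proposed method; the function names, the noise scale `sigma`, and the isotropic Gaussian perturbation are assumptions chosen to mirror the common baseline the abstract criticizes:

```python
import numpy as np

rng = np.random.default_rng(0)

def policy(params, obs):
    """Toy deterministic linear policy: action = W @ obs."""
    return params @ obs

def act_with_action_noise(params, obs, sigma=0.1):
    """Action noise: perturb the policy's output independently at each step."""
    action = policy(params, obs)
    return action + sigma * rng.normal(size=action.shape)

def perturb_params(params, sigma=0.1):
    """Parameter noise: perturb the weights (e.g. once per episode),
    then act deterministically, giving temporally consistent exploration.
    An isotropic Gaussian in parameter space corresponds to the uniform
    exploration of a huge parameter space mentioned in the abstract."""
    return params + sigma * rng.normal(size=params.shape)

# Example: 3-dimensional observation, 2-dimensional action.
params = rng.normal(size=(2, 3))
obs = np.ones(3)

a_action_noise = act_with_action_noise(params, obs)
a_param_noise = policy(perturb_params(params), obs)
```

With parameter noise, the same perturbed weights are reused across many steps, so exploration is state-dependent and consistent over time, whereas action noise injects fresh randomness at every step.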