Host: The Japan Society of Mechanical Engineers
Name : [in Japanese]
Date : June 02, 2018 - June 05, 2018
In recent years, reinforcement learning combined with deep learning has developed rapidly, achieving strong performance not only in game playing but also in the continuous control of robots. Reinforcement learning requires exploratory behavior, and action noise is widely used to realize it. Recent research has tackled the exploration problem in deep reinforcement learning by using parameter noise, which has been experimentally shown to explore better than the commonly used action noise. However, existing methods either need a long time to update the noise distribution or explore uniformly in a huge parameter space by using an isotropic noise distribution. This paper proposes a method that improves the update of the noise distribution for faster learning.
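The contrast the abstract draws between action noise and parameter noise can be sketched as follows. This is a toy illustration with a linear policy, not the paper's proposed method; the function names, the noise scale `sigma`, and the isotropic Gaussian perturbation are assumptions chosen to mirror the common baseline the abstract criticizes:

```python
import numpy as np

rng = np.random.default_rng(0)

def policy(params, obs):
    """Toy deterministic linear policy: action = W @ obs."""
    return params @ obs

def act_with_action_noise(params, obs, sigma=0.1):
    """Action noise: perturb the policy's output independently at each step."""
    action = policy(params, obs)
    return action + sigma * rng.normal(size=action.shape)

def perturb_params(params, sigma=0.1):
    """Parameter noise: perturb the weights (e.g. once per episode),
    then act deterministically, giving temporally consistent exploration.
    An isotropic Gaussian in parameter space corresponds to the uniform
    exploration of a huge parameter space mentioned in the abstract."""
    return params + sigma * rng.normal(size=params.shape)

# Example: 3-dimensional observation, 2-dimensional action.
params = rng.normal(size=(2, 3))
obs = np.ones(3)

a_action_noise = act_with_action_noise(params, obs)
a_param_noise = policy(perturb_params(params), obs)
```

With parameter noise, the same perturbed weights are reused across many steps, so exploration is state-dependent and consistent over time, whereas action noise injects fresh randomness at every step.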