IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Deep Reinforcement Learning with Sarsa and Q-Learning: A Hybrid Approach
Zhi-xiong XU, Lei CAO, Xi-liang CHEN, Chen-xi LI, Yong-liang ZHANG, Jun LAI

2018, Volume E101.D, Issue 9, pp. 2315-2322

Abstract

The widely used Deep Q Networks (DQN) algorithm is known to overestimate action values under certain conditions. It has also been shown that such overestimations harm performance and can cause instability and divergence of learning. In this paper, we present the Deep Sarsa and Q Networks (DSQN) algorithm, which can be considered an enhancement of the Deep Q Networks algorithm. First, the DSQN algorithm takes advantage of the experience replay and target network techniques of Deep Q Networks to improve the stability of the neural networks. Second, a double estimator is used for Q-learning to reduce overestimations. In particular, we introduce Sarsa learning into Deep Q Networks to remove overestimations further. Finally, the DSQN algorithm is evaluated on the cart-pole balancing, mountain car and lunar lander control tasks from the OpenAI Gym. The empirical evaluation results show that the proposed method leads to reduced overestimations, a more stable learning process and improved performance.
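The abstract describes a target that blends a double-estimator Q-learning term with a Sarsa term. The paper's exact update rule is not reproduced here; the following is a minimal illustrative sketch of one such hybrid target, assuming a hypothetical mixing weight `beta` and plain NumPy arrays standing in for the online and target networks:

```python
import numpy as np

def hybrid_target(reward, done, q_next_online, q_next_target, next_action,
                  beta=0.5, gamma=0.99):
    """Blend a double-estimator Q-learning target with a Sarsa target.

    q_next_online / q_next_target: action-value rows for the next state
    from the online and target estimators (plain arrays in this sketch).
    next_action: the action actually executed in the next state (Sarsa).
    beta: hypothetical mixing weight, not specified in the abstract.
    """
    if done:
        return reward
    # Double-estimator term: select the action with the online values,
    # evaluate it with the target values (reduces overestimation).
    q_learning = q_next_target[int(np.argmax(q_next_online))]
    # Sarsa term: evaluate the action actually taken in the next state.
    sarsa = q_next_target[next_action]
    return reward + gamma * (beta * q_learning + (1.0 - beta) * sarsa)
```

With `beta = 1` this reduces to a double-Q-style target, and with `beta = 0` to an on-policy Sarsa target, which is the sense in which the two methods are combined.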

© 2018 The Institute of Electronics, Information and Communication Engineers