IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Reward-Based Exploration: Adaptive Control for Deep Reinforcement Learning
Zhi-xiong XU, Lei CAO, Xi-liang CHEN, Chen-xi LI
JOURNAL FREE ACCESS

2018 Volume E101.D Issue 9 Pages 2409-2412

Abstract

To address the trade-off between exploration and exploitation in deep reinforcement learning, this paper proposes a “reward-based exploration strategy combined with Softmax action selection” (RBE-Softmax), a dynamic exploration strategy that guides the agent's learning. The advantage of the proposed method is that it uses characteristics of the agent's learning process to adapt the exploration parameters online, so that the agent can select potentially optimal actions more effectively. The method is evaluated on discrete and continuous control tasks in OpenAI Gym, and the empirical results show that RBE-Softmax yields statistically significant improvements in the performance of deep reinforcement learning algorithms.
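
As a rough illustration of the kind of mechanism the abstract describes, the sketch below pairs Softmax (Boltzmann) action selection with a temperature that is adjusted from observed reward. The abstract does not state the adaptation rule, so `adapt_temperature`, its comparison against a baseline reward, and all parameter names and default values are hypothetical choices made for this example and should not be read as the RBE-Softmax update itself.

```python
import math
import random


def softmax_probs(q_values, temperature):
    """Boltzmann distribution over actions; a lower temperature is greedier."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    exps = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, temperature, rng=random):
    """Sample an action index in proportion to its Softmax probability."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def adapt_temperature(temperature, episode_reward, baseline_reward,
                      decay=0.9, growth=1.1, t_min=0.05, t_max=5.0):
    """Hypothetical reward-driven rule (an assumption, not from the paper).

    Explore less (cool down) when the latest episode beat the baseline,
    explore more (heat up) otherwise, and clamp to [t_min, t_max].
    """
    factor = decay if episode_reward > baseline_reward else growth
    return min(t_max, max(t_min, temperature * factor))


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0]          # assumed action-value estimates
    temperature = 1.0
    action = select_action(q, temperature)
    # Pretend the episode scored above a running baseline.
    temperature = adapt_temperature(temperature, episode_reward=10.0,
                                    baseline_reward=5.0)
    print(action, temperature)
```

In this sketch a high temperature makes the action probabilities nearly uniform, while a low one concentrates them on the highest-valued action, so changing the temperature online is one simple way to shift between exploration and exploitation.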

© 2018 The Institute of Electronics, Information and Communication Engineers