Transactions of the Society of Instrument and Control Engineers
Online ISSN : 1883-8189
Print ISSN : 0453-4654
ISSN-L : 0453-4654
Reinforcement Learning Using Feedforward Neural Network with Memory Mechanism
Seiichi OZAWA, Naoto SHIRAGA

2003 Volume 39 Issue 12 Pages 1129-1135

Abstract
In reinforcement learning problems, an agent learns what to do so as to maximize a numerical reward. In many cases, the agent learns proper actions by estimating an action-value function. When the agent's state space is continuous, the action-value function generally cannot be represented by a lookup table; one solution is to approximate it with a neural network. However, when a neural network is trained incrementally, previously learned input-output relationships tend to be destroyed by newly presented data, a phenomenon called "interference". Since rewards are given incrementally by the environment, interference can be serious in reinforcement learning problems as well. To address this problem, we propose a memory-based reinforcement learning model composed of a Resource Allocating Network (RAN) and a memory. The distinctive feature of the proposed model is that it requires only a small amount of main memory to learn action-value functions accurately. To examine this feature, the proposed model is applied to two benchmark problems: a Random Walk Task and an Extended Mountain-Car Task, in which the learning domains are expanded over time in order to evaluate incremental learning ability. The simulations verify that the proposed model can approximate proper action-value functions with far less main memory than the conventional approaches.
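The abstract does not give the model's exact allocation criteria, memory-retrieval scheme, or hyperparameters, so the following is only a minimal Python sketch of the general idea, not the authors' method: a standard Resource Allocating Network (Platt, 1991) that grows an RBF hidden unit when an input is both novel and poorly predicted, wrapped with a plain FIFO rehearsal memory as a stand-in for the paper's memory mechanism. All class names, thresholds, and the rehearsal strategy here are illustrative assumptions.

```python
import numpy as np

class RAN:
    """Resource Allocating Network (Platt, 1991): an RBF network that
    allocates a new hidden unit when an input is both novel (far from
    all existing centres) and poorly predicted, and otherwise adapts
    the existing parameters by gradient descent."""

    def __init__(self, in_dim, out_dim, err_thresh=0.05,
                 dist_thresh=0.5, overlap=0.8, lr=0.05):
        self.centres = np.empty((0, in_dim))    # RBF centres
        self.widths = np.empty(0)               # RBF widths
        self.weights = np.empty((0, out_dim))   # output weights
        self.bias = np.zeros(out_dim)
        self.err_thresh = err_thresh            # allocation: error criterion
        self.dist_thresh = dist_thresh          # allocation: novelty criterion
        self.overlap = overlap                  # width of a newly allocated unit
        self.lr = lr                            # gradient-descent step size

    def _phi(self, x):
        d2 = np.sum((self.centres - x) ** 2, axis=1)
        return np.exp(-d2 / np.maximum(self.widths ** 2, 1e-12))

    def predict(self, x):
        return self.bias + self._phi(x) @ self.weights

    def train(self, x, target):
        err = target - self.predict(x)
        dists = np.linalg.norm(self.centres - x, axis=1)
        nearest = dists.min() if dists.size else np.inf
        if np.linalg.norm(err) > self.err_thresh and nearest > self.dist_thresh:
            # Novel and poorly predicted: allocate a new hidden unit
            # centred on x, with the residual error as its output weight.
            width = self.overlap * (nearest if np.isfinite(nearest)
                                    else self.dist_thresh)
            self.centres = np.vstack([self.centres, x])
            self.widths = np.append(self.widths, width)
            self.weights = np.vstack([self.weights, err])
        else:
            # Familiar input: plain LMS update of the linear output layer
            # (the full RAN also adapts centres; omitted for brevity).
            phi = self._phi(x)
            self.bias += self.lr * err
            self.weights += self.lr * np.outer(phi, err)


class MemoryRAN:
    """RAN wrapped with a small FIFO memory of past (input, target)
    pairs. Each new sample is learned together with a few retrieved
    past samples ("rehearsal"), which counteracts the interference
    caused by purely incremental training."""

    def __init__(self, in_dim, out_dim, capacity=200, rehearse=5):
        self.net = RAN(in_dim, out_dim)
        self.memory = []
        self.capacity, self.rehearse = capacity, rehearse
        self.rng = np.random.default_rng(0)

    def train(self, x, target):
        self.net.train(x, target)
        if self.memory:  # rehearse stored pairs so old regions survive
            idx = self.rng.choice(len(self.memory),
                                  size=min(self.rehearse, len(self.memory)),
                                  replace=False)
            for i in idx:
                self.net.train(*self.memory[i])
        self.memory.append((np.array(x), np.array(target)))
        if len(self.memory) > self.capacity:
            self.memory.pop(0)

    def predict(self, x):
        return self.net.predict(x)
```

In a one-step Q-learning setting, such a network would output one value per discrete action for a continuous state s, and the TD target r + gamma * max_a Q(s', a), written into the component of the action actually taken, would serve as the training signal passed to train().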
© The Society of Instrument and Control Engineers (SICE)