Short-term memory ability of reservoir-based temporal difference learning model

Yu Yoshino; Yuichi Katori

doi:10.1587/nolta.13.203

Abstract

A network model that combines temporal difference (TD) learning with reservoir computing (RC) has been proposed to control autonomous robots. RC is a framework for constructing recurrent neural networks that process complex time series at low computational cost. TD learning is a reinforcement learning framework in which an agent learns to take actions in an environment so as to maximize the cumulative reward. The control model using TD learning with RC optimizes the agent's actions based on sensory signals that are continuous-valued and time-varying. The model uses online reinforcement learning to train the connection weights between the reservoir and the output layer so that the output represents the action value. In the present study, we evaluate the model on a task requiring short-term memory and clarify the reservoir's role in memorizing task-relevant sensory information. We show that the reservoir in the RC-based TD learning model enhances performance in the memory-required task. The choice of parameter values that specify the reservoir dynamics is critical to ensuring performance in the task.
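The architecture described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a leaky-integrator echo state network with fixed random input and recurrent weights, a linear readout that estimates action values, and a Q-learning-style TD update applied only to the readout. The class name `ReservoirTD`, all parameter names, and all default values (spectral radius, leak rate, learning rate, discount factor) are hypothetical choices for illustration.

```python
import numpy as np


class ReservoirTD:
    """Illustrative reservoir with a linear action-value readout trained by TD."""

    def __init__(self, n_in, n_res, n_actions, spectral_radius=0.9,
                 leak=0.3, lr=0.01, gamma=0.9, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random input weights (never trained).
        self.w_in = rng.uniform(-1.0, 1.0, size=(n_res, n_in))
        # Fixed random recurrent weights, rescaled to the target spectral radius.
        w = rng.normal(size=(n_res, n_res))
        w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
        self.w_rec = w
        # Readout weights: the only parameters updated by learning.
        self.w_out = np.zeros((n_actions, n_res))
        self.leak, self.lr, self.gamma = leak, lr, gamma
        self.x = np.zeros(n_res)

    def reset(self):
        """Clear the reservoir state at the start of an episode."""
        self.x = np.zeros_like(self.x)

    def step(self, u):
        """Advance the reservoir with sensory input u; return a copy of the state."""
        pre = self.w_rec @ self.x + self.w_in @ np.asarray(u, dtype=float)
        self.x = (1.0 - self.leak) * self.x + self.leak * np.tanh(pre)
        return self.x.copy()

    def q_values(self, x):
        """Linear readout: one action-value estimate per action."""
        return self.w_out @ x

    def td_update(self, x, action, reward, x_next, done):
        """Online TD update of the readout row for the action taken (assumed Q-learning form)."""
        bootstrap = 0.0 if done else self.gamma * np.max(self.q_values(x_next))
        td_error = reward + bootstrap - self.q_values(x)[action]
        self.w_out[action] += self.lr * td_error * x
        return td_error
```

In a memory task, one would call `step` once per time step with the current sensory input, select an action from `q_values`, and call `td_update` with the states before and after the transition. Because the reservoir state depends on past inputs through the recurrent weights, a cue presented earlier can still influence the action values after the cue has disappeared. How long that trace persists depends on parameters such as the spectral radius and leak rate, in line with the abstract's remark that the reservoir parameters are critical.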
