IEEJ Transactions on Electronics, Information and Systems
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
Reinforcement Learning to Change Action Selection Probabilities by Motivation Values
Tomoki Hamagami, Seiichi Koakutsu, Hironori Hirata

2002 Volume 122 Issue 12 Pages 2157-2164

Abstract
This paper proposes a new reinforcement learning method based on a “Motivation Value,” which changes action selection probabilities in order to realize a policy that depends on state-action context. The motivation value defined in this paper is a parameter that temporarily emphasizes (or de-emphasizes) specific action selection probabilities, thereby indirectly controlling the next action selection during the control phase. Furthermore, the motivation value is recorded in a form corresponding to each state-action pair, like a Q-value, and is updated as learning progresses. A feature of the proposed method is its practical advantage: it can be implemented as a comparatively simple extension of general reinforcement learning. To investigate the validity of the proposed method, it was applied to a maze problem containing the perceptual aliasing problem. Experimental results show that the method is an effective learning algorithm in non-Markov decision process environments that contain perceptual aliasing problems.
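The mechanism described in the abstract can be sketched as follows. Since the paper's exact update and decay rules are not reproduced here, the additive softmax bias and the names `q_values` and `motivation` are illustrative assumptions, not the authors' precise formulation: a per-(state, action) motivation value is simply added to the Q-value before computing action selection probabilities, so a positive motivation temporarily emphasizes an action and a negative one de-emphasizes it.

```python
import math
import random


def softmax_probs(q_values, motivation, temperature=1.0):
    """Action selection probabilities biased by per-(state, action) motivation.

    Illustrative sketch: the motivation term is added to the Q-value so it
    temporarily emphasizes (positive) or de-emphasizes (negative) specific
    actions without changing the learned Q-values themselves.
    """
    prefs = [(q + m) / temperature for q, m in zip(q_values, motivation)]
    mx = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - mx) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, motivation, rng=random):
    """Sample an action index from the motivation-biased distribution."""
    probs = softmax_probs(q_values, motivation)
    r, cum = rng.random(), 0.0
    for a, p in enumerate(probs):
        cum += p
        if r < cum:
            return a
    return len(probs) - 1
```

In a perceptually aliased state (two distinct world states that look identical to the agent), such a bias can break ties between actions that have equal Q-values, which is what makes a context-dependent policy possible under non-Markov conditions.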
© The Institute of Electrical Engineers of Japan