Transactions of the Institute of Systems, Control and Information Engineers
Online ISSN : 2185-811X
Print ISSN : 1342-5668
ISSN-L : 1342-5668
Papers
Proposal of New Reinforcement Learning with a State-independent Policy
Daichi Nakano, Shin-ichi Maeda, Shin Ishii

2014 Volume 27 Issue 8 Pages 327-332

Abstract

Reinforcement learning (RL) algorithms usually have difficulty learning the optimal control policy as the dimensionality of the state (and action) grows, because the search space to be optimized expands explosively. To avoid this explosion, we propose the BASLEM algorithm (Blind Action Sequence Learning with EM algorithm), which acquires a state-independent, time-dependent control policy starting from a fixed initial state. Numerical simulations of controlling a non-holonomic system show that RL with state-independent, time-dependent policies attains a large improvement in efficiency over the existing RL algorithm.
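
The abstract does not give the update equations, so the following is only a minimal sketch of the general idea of a state-independent, time-dependent policy trained with an EM-style reward-weighted update. It is not the BASLEM algorithm itself. The unicycle model, the goal position, the reward shape, and all names (`rollout`, `reward`, `train`) are assumptions made for illustration.

```python
import math
import random


def rollout(actions, dt=0.1):
    """Simulate a unicycle (a simple non-holonomic system; assumed here)."""
    x, y, theta = 0.0, 0.0, 0.0  # fixed initial state
    for v, omega in actions:
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += omega * dt
    return x, y, theta


def reward(state, goal=(1.0, 1.0)):
    """Non-negative terminal reward in (0, 1]; the goal is hypothetical."""
    x, y, _ = state
    d2 = (x - goal[0]) ** 2 + (y - goal[1]) ** 2
    return math.exp(-d2)


def train(horizon=20, n_samples=200, n_iters=30, seed=0):
    rng = random.Random(seed)
    # One Gaussian per time step and action dimension (v, omega).
    # The policy depends on the time index t only, never on the state.
    mean = [[0.0, 0.0] for _ in range(horizon)]
    std = [[1.0, 1.0] for _ in range(horizon)]
    history = []  # average sampled reward per iteration
    for _ in range(n_iters):
        # Sampling step: draw whole action sequences "blindly",
        # then weight each sequence by the reward it earned.
        samples = [
            [[rng.gauss(mean[t][k], std[t][k]) for k in range(2)]
             for t in range(horizon)]
            for _ in range(n_samples)
        ]
        weights = [reward(rollout(seq)) for seq in samples]
        total = sum(weights)
        history.append(total / n_samples)
        if total <= 1e-12:
            continue  # no usable signal in this batch; keep the policy
        # M-step: reward-weighted mean and standard deviation per (t, k).
        for t in range(horizon):
            for k in range(2):
                m = sum(w * s[t][k] for w, s in zip(weights, samples)) / total
                var = sum(
                    w * (s[t][k] - m) ** 2 for w, s in zip(weights, samples)
                ) / total
                mean[t][k] = m
                std[t][k] = max(math.sqrt(var), 0.05)  # keep some exploration
    return mean, history


if __name__ == "__main__":
    learned_mean, rewards = train()
    print("average reward, first iteration:", rewards[0])
    print("average reward, last iteration:", rewards[-1])
```

In this sketch the number of policy parameters grows with the horizon and the action dimension but not with the state dimension, which is one way to read the abstract's argument about avoiding growth of the search space. The price is that such a policy is open-loop: it is only meaningful from the fixed initial state it was trained on.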

© 2014 The Institute of Systems, Control and Information Engineers