Transactions of the Society of Instrument and Control Engineers
Online ISSN : 1883-8189
Print ISSN : 0453-4654
ISSN-L : 0453-4654
Acceleration of Reinforcement Learning by Estimating State Transition Probability Model
Shinji FUJII, Kei SENDA, Syusuke MANO

2006 Volume 42 Issue 1 Pages 47-53

Abstract
Q-learning is a typical reinforcement learning method. Because Q-learning requires a large amount of time to solve a problem, this study proposes methods for accelerating it. Two approaches are introduced, both based on the iteration methods of dynamic programming. One uses Robbins-Monro estimation of the state transition probability model. The other applies iterative solution methods for an inverse matrix, e.g., the Jacobi method, the Gauss-Seidel method, and the successive over-relaxation (SOR) method. These approaches make it possible to determine an appropriate learning factor. Numerical simulations show that the proposed methods are more efficient than Q-learning.
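The two ingredients named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a finite state and action space, a Robbins-Monro step size of 1/n (which makes the estimate equal to the empirical transition frequency), and a fixed policy whose value is obtained by solving (I - gamma P) v = r with an SOR sweep. The function names, the array layout `P_hat[s, a, s_next]`, and the parameters `gamma` and `omega` are all hypothetical choices made for this illustration.

```python
import numpy as np

def update_transition_estimate(P_hat, counts, s, a, s_next):
    """Robbins-Monro style update of the estimated row P_hat[s, a, :].

    Assumption: step size alpha = 1/n, where n is the number of visits
    to the pair (s, a). The arrays are modified in place.
    """
    counts[s, a] += 1
    alpha = 1.0 / counts[s, a]
    target = np.zeros(P_hat.shape[2])
    target[s_next] = 1.0              # one-hot sample of the next state
    P_hat[s, a] += alpha * (target - P_hat[s, a])

def sor_policy_evaluation(P_pi, r_pi, gamma=0.9, omega=1.0,
                          sweeps=1000, tol=1e-10):
    """Solve (I - gamma * P_pi) v = r_pi by SOR sweeps.

    omega = 1.0 reduces to the Gauss-Seidel method. P_pi is the
    state transition matrix under a fixed policy and r_pi the
    expected one-step reward per state (both assumed given).
    """
    n = len(r_pi)
    A = np.eye(n) - gamma * P_pi
    v = np.zeros(n)
    for _ in range(sweeps):
        max_change = 0.0
        for i in range(n):
            # Sum over off-diagonal terms, using already updated entries.
            sigma = A[i] @ v - A[i, i] * v[i]
            gauss_seidel = (r_pi[i] - sigma) / A[i, i]
            new_value = (1.0 - omega) * v[i] + omega * gauss_seidel
            max_change = max(max_change, abs(new_value - v[i]))
            v[i] = new_value
        if max_change < tol:
            break
    return v
```

In such a sketch, an agent would call `update_transition_estimate` after each observed transition and periodically re-solve for the value function from the current estimate, instead of relying only on sampled Q-learning backups. For gamma < 1 the matrix I - gamma P is strictly diagonally dominant, so the Gauss-Seidel case (omega = 1) converges; values of omega above 1 may speed up or destabilize the sweep and would need to be tuned.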
© The Society of Instrument and Control Engineers (SICE)