Transactions of the Institute of Systems, Control and Information Engineers
Online ISSN : 2185-811X
Print ISSN : 1342-5668
ISSN-L : 1342-5668
Paper
Proposal of a New Reinforcement Learning Method Using State-Independent Policies
中野 太智, 前田 新一, 石井 信

2014, Vol. 27, No. 8, pp. 327-332

Abstract

Reinforcement learning (RL) algorithms usually have difficulty learning the optimal control policy as the dimensionality of the state (and action) grows, because the search space to be optimized increases explosively. To avoid this explosion, we propose the BASLEM algorithm (Blind Action Sequence Learning with EM algorithm), which acquires a state-independent, time-dependent control policy starting from a fixed initial state. A numerical simulation of controlling a non-holonomic system shows that RL of state-independent, time-dependent policies attains a large improvement in efficiency over an existing RL algorithm.
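
To make the idea of a state-independent, time-dependent policy concrete, the following is a minimal sketch and not the BASLEM algorithm itself, whose details the abstract does not give. It swaps in a generic reward-weighted, EM-style update of a per-time-step mean action. The unicycle model, the goal position, the reward function, and all names and parameter values (`rollout`, `reward`, `learn_action_sequence`, `sigma`, and so on) are assumptions chosen for illustration.

```python
import math
import random


def rollout(actions, dt=0.1):
    """Simulate a unicycle (a simple non-holonomic system) from a fixed start.

    `actions` is a sequence of (forward speed, turn rate) pairs, one per step.
    """
    x, y, theta = 0.0, 0.0, 0.0  # fixed initial state (assumption)
    for v, omega in actions:
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += omega * dt
    return x, y, theta


def reward(actions, goal=(1.0, 1.0)):
    """Positive reward that grows as the final position nears the goal."""
    x, y, _ = rollout(actions)
    return math.exp(-((x - goal[0]) ** 2 + (y - goal[1]) ** 2))


def learn_action_sequence(horizon=20, iterations=100, samples=50,
                          sigma=0.3, seed=0):
    """Reward-weighted, EM-style update of a time-indexed action sequence."""
    rng = random.Random(seed)
    # One mean action per time step; the policy never looks at the state.
    mean = [[0.0, 0.0] for _ in range(horizon)]
    history = []
    for _ in range(iterations):
        # Sample noisy action sequences around the current means and score them.
        batch = []
        for _ in range(samples):
            acts = [(rng.gauss(m[0], sigma), rng.gauss(m[1], sigma))
                    for m in mean]
            batch.append((reward(acts), acts))
        total = sum(r for r, _ in batch)
        # New mean at each step = reward-weighted average of sampled actions.
        mean = [[sum(r * acts[t][k] for r, acts in batch) / total
                 for k in range(2)]
                for t in range(horizon)]
        history.append(reward(mean))
    return mean, history
```

Calling `mean, history = learn_action_sequence()` returns the learned per-step mean actions and the reward of the mean sequence after each iteration. The all-zero starting sequence leaves the vehicle at the origin, so its reward under these assumptions is `exp(-2)`, which serves as a baseline. Because the search is over a sequence indexed only by time, the number of parameters grows with the horizon rather than with the size of the state space, which is the motivation the abstract describes.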

© 2014 The Institute of Systems, Control and Information Engineers