An On-line EM Reinforcement Learning Method for Automatic Control of Continuous Dynamical Systems

Junichiro Yoshimoto; Shin Ishii; Masa-aki Sato

doi:10.5687/iscie.16.209


2003, Volume 16, Issue 5, pp. 209-217

Abstract

In this paper, we propose a new reinforcement learning (RL) method for dynamical systems with continuous state and action spaces. The method has an actor-critic architecture: the critic approximates the Q-function, and the actor approximates a stochastic soft-max policy that depends on the Q-function. Both the critic and the actor are trained with an on-line EM algorithm. We apply the method to two control problems, and computer simulations show that it acquires good control after a few learning trials.
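
To make the phrase "soft-max policy dependent on the Q-function" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a finite set of candidate actions and a toy quadratic Q-function, both hypothetical. The abstract describes continuous action spaces with the actor trained by on-line EM; that training step is not shown here. The names `softmax_policy`, `sample_action`, and `temperature` are illustrative.

```python
import numpy as np


def softmax_policy(q_values, temperature=1.0):
    """Return soft-max (Boltzmann) probabilities over candidate actions."""
    q = np.asarray(q_values, dtype=float)
    # Subtract the maximum before exponentiating for numerical stability.
    z = (q - q.max()) / temperature
    p = np.exp(z)
    return p / p.sum()


def sample_action(candidate_actions, q_values, temperature=1.0, rng=None):
    """Draw one action with probability proportional to exp(Q / temperature)."""
    rng = np.random.default_rng() if rng is None else rng
    probs = softmax_policy(q_values, temperature)
    idx = rng.choice(len(candidate_actions), p=probs)
    return candidate_actions[idx]


# Hypothetical example: five candidate actions and a toy Q-function
# that peaks at u = 0.5 (an assumption for illustration only).
actions = np.linspace(-1.0, 1.0, 5)
q = -(actions - 0.5) ** 2
probs = softmax_policy(q, temperature=0.1)
chosen = sample_action(actions, q, temperature=0.1, rng=np.random.default_rng(0))
```

A lower `temperature` concentrates probability on the action with the highest Q-value, while a higher one makes the policy more exploratory.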
