Transactions of the Japanese Society for Artificial Intelligence
Online ISSN : 1346-8030
Print ISSN : 1346-0714
ISSN-L : 1346-0714
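Paper
Policy Representation Using Multiple Weighted Normal Distributions
Real-Time Reinforcement Learning That Can Track Changes in the Optimal Action, and Its Application to a Ring-Shaped Robot
木村 元, 荒牧 岳志, 小林 重信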

2003, Vol. 18, No. 6, pp. 316-324

Abstract
In this paper, we attempt to solve a reinforcement learning problem for a five-link ring robot in real time, so that the physical robot can withstand the trial and error involved. On this robot, incomplete perception problems arise from noisy sensors and inexpensive position-controlled motor systems. This incomplete perception also causes the optimal actions to vary as learning progresses. To cope with this problem, we adopt an actor-critic method and propose a new hierarchical policy representation scheme that consists of discrete action selection at the top level and continuous action selection at the lower level of the hierarchy. The proposed hierarchical scheme accelerates learning in a continuous action space, and it can track the optimal actions as they vary with the progress of learning in our robotics problem. This paper compares and discusses several learning algorithms through simulations, and demonstrates the proposed method by applying it to the real robot.
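The two-level action selection described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the top level chooses one of several normal distributions with probabilities given by a softmax over preference values, and the lower level draws a one-dimensional continuous action from the chosen distribution. The names `softmax`, `sample_action`, `prefs`, `means`, and `stds` are hypothetical, and the actor-critic parameter updates are omitted.

```python
import math
import random


def softmax(prefs):
    """Turn preference values into selection probabilities (assumed weighting)."""
    m = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(prefs, means, stds, rng):
    """Top level: pick a component index. Lower level: draw a continuous action."""
    weights = softmax(prefs)
    # Discrete selection among the normal distributions.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Continuous selection from the chosen normal distribution.
    action = rng.gauss(means[k], stds[k])
    return k, action


# Hypothetical parameters for three components over a 1-D action.
rng = random.Random(0)
prefs = [0.0, 1.0, -1.0]
means = [-1.0, 0.0, 1.0]
stds = [0.1, 0.2, 0.1]
k, action = sample_action(prefs, means, stds, rng)
print(k, action)
```

In a sketch like this, shifting the preference values moves probability mass between components quickly, which is one plausible way a policy could follow an optimal action that changes during learning.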
© 2003 JSAI (The Japanese Society for Artificial Intelligence)