Proceedings of the JSME Kanto Branch General Meeting and Conference (日本機械学会関東支部総会講演会講演論文集)
Online ISSN : 2424-2691
ISSN-L : 2424-2691
Session ID: 11912
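11912 Introduction and Evaluation of a Penalty Basis Degree Threshold Determination Mechanism for the Improved Penalty Avoiding Rational Policy Making Algorithm (OS7 Robotics and Mechatronics (3), Organized Session)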
小林 諒平, 宮崎 和光, 小林 博明
Conference proceedings / abstracts, free access

Abstract
The Penalty Avoiding Rational Policy Making algorithm (PARP) is based on the Profit Sharing method and was designed to learn a penalty-avoiding policy. PARP has since been improved to save memory and to cope with uncertainty. The efficiency of the Improved Penalty Avoiding Rational Policy Making algorithm is significantly influenced by the threshold γ of the penalty basis function. Until now, an appropriate γ has had to be set through preliminary experiments. In this paper, we propose a technique for learning γ with the multi-start method. The proposed technique is applied to the keepaway task, a benchmark in robotic soccer, to confirm its effectiveness.
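As a rough illustration of the general multi-start idea only, the sketch below runs a simple local search on a threshold from several random starting values and keeps the best result. It is not the authors' procedure: the `evaluate` function, its toy objective, the assumed range [0, 1] for γ, and all names and parameter values are hypothetical stand-ins, since the abstract does not specify them.

```python
import random


def evaluate(gamma):
    """Hypothetical stand-in for training with threshold gamma and scoring it.

    A real evaluation would run the learner (e.g. on keepaway episodes);
    here a toy objective with its peak at gamma = 0.3 is assumed.
    """
    return -(gamma - 0.3) ** 2


def local_search(gamma, step=0.05, iters=50):
    """Hill climbing on gamma, clipped to the assumed range [0, 1]."""
    best, best_score = gamma, evaluate(gamma)
    for _ in range(iters):
        improved = False
        for cand in (best - step, best + step):
            cand = min(1.0, max(0.0, cand))
            score = evaluate(cand)
            if score > best_score:
                best, best_score = cand, score
                improved = True
        if not improved:
            step /= 2  # no better neighbour: refine the step size
    return best, best_score


def multi_start(n_starts=5, seed=0):
    """Run local search from several random starts and keep the best result."""
    rng = random.Random(seed)
    results = [local_search(rng.random()) for _ in range(n_starts)]
    return max(results, key=lambda r: r[1])


best_gamma, best_score = multi_start()
print(f"best gamma: {best_gamma:.3f}")
```

Restarting from several initial values reduces the chance that a single poor starting threshold determines the outcome, at the cost of repeating the evaluation for each start.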
© 2010 The Japan Society of Mechanical Engineers