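11912 Introduction and Evaluation of a Penalty Basis Threshold Determination Mechanism for the Improved Penalty Avoiding Rational Policy Making Algorithm (OS7 Robotics and Mechatronics (3), Organized Session)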

小林 諒平; 宮崎 和光; 小林 博明

doi:10.1299/jsmekanto.2010.16.87

Abstract

The Penalty Avoiding Rational Policy Making algorithm (PARP) is based on the Profit Sharing method and was designed to learn a penalty-avoiding policy. PARP has since been improved to save memory and to cope with uncertainty. The efficiency of the Improved Penalty Avoiding Rational Policy Making algorithm is strongly influenced by γ, the threshold of the penalty basis function. Until now, an appropriate γ has had to be set through a preliminary experiment. In this paper, we propose a technique for learning γ with the multi-start method. The proposed technique is applied to a keepaway task, a benchmark in robotic soccer, to confirm its effectiveness.
