Proceedings of the Robotics and Mechatronics Conference (ロボティクス・メカトロニクス講演会講演概要集)
Online ISSN: 2424-3124
Session ID: 1P1-C10
1P1-C10 Generation of Team Behavior Based on Estimation of the State Values of Self and Others
島田 皓樹, 高橋 泰岳, 浅田 稔
Conference proceedings and abstracts, free access

Abstract
This paper presents a method that uses the state value functions of macro actions to explore appropriate behavior efficiently in a multi-agent environment. First, the agent learns a few macro actions and their state value functions in advance through reinforcement learning. Second, an appropriate initial controller for learning cooperative behavior is generated from these state value functions. The initial controller uses the state values of the macro actions so that the learner tends to select a good macro action. By combining these ideas with a two-layer hierarchical system, the proposed method performs better during learning than conventional methods. A case study on a game task with 4 defenders against 5 attackers shows that the learning agent (a passer on the offense team) successfully acquired teamwork plays (pass and shoot) in a shorter learning time.
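The idea of biasing the learner toward good macro actions can be illustrated with a minimal sketch. It is not the authors' implementation: the softmax rule, the `temperature` parameter, the function names, and the example values are all assumptions made for illustration. It only shows one way an upper layer could turn the state values reported by pre-learned macro actions into selection probabilities.

```python
import math
import random


def selection_probabilities(state_values, temperature=0.2):
    """Map each macro action's state value to a selection probability.

    Assumption: a softmax over state values; the abstract does not
    specify the actual selection rule.
    """
    highest = max(state_values.values())
    # Subtract the maximum before exponentiating for numerical stability.
    weights = {
        action: math.exp((value - highest) / temperature)
        for action, value in state_values.items()
    }
    total = sum(weights.values())
    return {action: w / total for action, w in weights.items()}


def select_macro_action(state_values, temperature=0.2, rng=random):
    """Sample one macro action, favouring those with higher state value."""
    probs = selection_probabilities(state_values, temperature)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


if __name__ == "__main__":
    # Hypothetical state values from three pre-learned macro actions.
    values = {"pass": 0.8, "shoot": 0.3, "dribble": 0.1}
    print(selection_probabilities(values))
    print(select_macro_action(values, rng=random.Random(0)))
```

In this sketch, a lower `temperature` makes the initial controller closer to greedy selection of the highest-valued macro action, while a higher one leaves more room for exploration during subsequent learning.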
© 2009 The Japan Society of Mechanical Engineers