1P1-C10 Generating Team Behavior Based on State Value Estimation of Self and Others

島田 皓樹; 高橋 泰岳; 浅田 稔

doi:10.1299/jsmermd.2009._1P1-C10_1

Abstract

This paper presents a method that uses the state value functions of macro actions to explore appropriate behavior efficiently in a multi-agent environment. First, the agent learns a few macro actions and their state value functions in advance through reinforcement learning. Second, an appropriate initial controller for learning cooperative behavior is generated from these state value functions. The initial controller uses the state values of the macro actions so that the learner tends to select a good macro action. By combining these ideas with a two-layer hierarchical system, the proposed method performs better during learning than conventional methods. A case study of a game task with 4 players on the defense team against 5 on the offense team is presented, in which the learning agent (a passer on the offense team) successfully acquired teamwork plays (pass and shoot) in a shorter learning time.
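
The idea of an initial controller that favors macro actions with high state values can be illustrated with a minimal sketch. The abstract does not say how state values are turned into selection preferences, so the softmax rule, the `temperature` parameter, the function names, and the example values below are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def initial_macro_action_probs(state_values, temperature=0.1):
    """Turn per-macro-action state values into selection probabilities.

    Assumption: a softmax over the state values. The abstract only says
    the learner should tend to select a good macro action.
    """
    if not state_values:
        raise ValueError("state_values must not be empty")
    # Subtract the maximum before exponentiating for numerical stability.
    max_v = max(state_values.values())
    weights = {
        name: math.exp((v - max_v) / temperature)
        for name, v in state_values.items()
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def select_macro_action(state_values, temperature=0.1, rng=random):
    """Sample one macro action according to the softmax probabilities."""
    probs = initial_macro_action_probs(state_values, temperature)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


# Hypothetical state values of three macro actions in the current state.
example_values = {"pass": 0.8, "shoot": 0.3, "dribble": 0.5}
probs = initial_macro_action_probs(example_values)
chosen = select_macro_action(example_values)
```

In this sketch a lower `temperature` makes the controller concentrate on the highest-valued macro action, while a higher one keeps more exploration. An upper-layer learner could start from such probabilities and then adjust them through its own reinforcement learning.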
