2P1-F06 Acquisition of Cooperative and Competitive Behaviors Based on the State Values of Other Agents (RoboCup and Robot Contests)

野間 健太郎; 高橋 泰岳; 浅田 稔

doi:10.1299/jsmermd.2007._2P1-F06_1

Abstract

Existing reinforcement learning approaches suffer from the curse of dimensionality when applied to multiagent dynamic environments. RoboCup competitions are a typical example, since other agents and their behaviors easily cause the state and action spaces to explode. The keys to learning cooperative and competitive behaviors in such an environment are as follows:

● A two-layer hierarchical system with multiple learning modules is adopted to reduce the size of the sensor and action spaces.
● The extent to which another agent's task has been achieved is estimated by observation and used as a state value in the top-layer state space to accelerate the learning of cooperative and competitive behaviors.

This paper presents a modular learning method for multiagent environments, by which the learning agent can acquire cooperative behaviors with its teammates and competitive behaviors against its opponents.
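The idea of using state values as the top-layer state can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the value range [0, 1], and the number of bins are all assumptions made for illustration. Each lower-layer module is assumed to report a scalar state value (how close its task is to being achieved), and the observed other agent contributes an estimated value in the same form. The top layer then sees only a short tuple of discretized values rather than the raw sensor space.

```python
from typing import Sequence, Tuple


def discretize(value: float, n_bins: int,
               v_min: float = 0.0, v_max: float = 1.0) -> int:
    """Map a state value to a bin index in 0..n_bins-1 (hypothetical helper)."""
    if n_bins < 1 or v_max <= v_min:
        raise ValueError("need n_bins >= 1 and v_max > v_min")
    ratio = (value - v_min) / (v_max - v_min)
    ratio = min(max(ratio, 0.0), 1.0)  # clamp values outside the assumed range
    return min(int(ratio * n_bins), n_bins - 1)


def top_layer_state(own_values: Sequence[float],
                    other_values: Sequence[float],
                    n_bins: int = 4) -> Tuple[int, ...]:
    """Build the top-layer state from the learner's own module values and
    the estimated values of other agents (assumed to come from observation)."""
    values = list(own_values) + list(other_values)
    return tuple(discretize(v, n_bins) for v in values)


def top_layer_size(n_modules: int, n_bins: int) -> int:
    """Number of distinct top-layer states for the given discretization."""
    return n_bins ** n_modules


# Two own modules plus one estimated value for another agent.
state = top_layer_state([0.1, 0.9], [0.5], n_bins=4)
size = top_layer_size(n_modules=3, n_bins=4)
```

Under these assumptions the top-layer state space grows only with the number of modules and bins, which is one way to read the abstract's claim that the hierarchy reduces the size of the learning problem.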
