Multi-Agent Reinforcement Learning Based on Estimating Others' Intentions Using a Self-Policy

不破 雅泰; 増山 岳人

doi:10.1299/jsmermd.2022.2A2-B10

Abstract

The complexity of Multi-Agent Reinforcement Learning (MARL) problems increases exponentially with the number of agents. This poor scalability with respect to the number of agents limits the application of MARL to large-scale multi-agent systems.

In this paper, we present a novel MARL algorithm that leverages a self-policy network to estimate the intentions of other agents. The intentions of other agents are estimated by backpropagation through the self-policy network, given the observed actions of those agents. The estimated intentions are then used as input to the self-policy network. As long as the agents are cooperative, our method does not require any additional model to learn others’ intentions. We also introduce a simple curriculum learning scheme that gradually increases the number of agents. Simulation results indicate that the proposed method improves the performance of the learned policy even as the number of agents increases.
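
The idea of estimating another agent's intention by backpropagating through one's own policy can be illustrated with a minimal sketch. The abstract does not specify the network architecture, the form of the intention variable, or the optimizer, so everything below is an assumption made for illustration: the policy is a single linear layer with a softmax (a stand-in for a trained self-policy network), the intention is a small continuous vector, and the names `log_policy`, `infer_intent`, `act`, `W_OBS`, and `W_INT` are hypothetical. The policy weights are held fixed while gradient ascent adjusts only the intention input so that the agent's own policy better explains the action it observed the other agent take.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, INTENT_DIM, N_ACTIONS = 4, 3, 5

# Hypothetical stand-in for a trained self-policy: one linear layer + softmax.
# In practice these weights would come from reinforcement learning.
W_OBS = rng.normal(scale=0.5, size=(N_ACTIONS, OBS_DIM))
W_INT = rng.normal(scale=0.5, size=(N_ACTIONS, INTENT_DIM))

def log_policy(obs, intent):
    """Log-probabilities log pi(a | obs, intent) for every action a."""
    logits = W_OBS @ obs + W_INT @ intent
    logits = logits - logits.max()  # subtract max for numerical stability
    return logits - np.log(np.exp(logits).sum())

def infer_intent(obs_other, action_other, steps=50, lr=0.1):
    """Gradient ascent on the intent input so that the agent's own policy
    assigns higher probability to the action the other agent was seen taking."""
    intent = np.zeros(INTENT_DIM)
    for _ in range(steps):
        probs = np.exp(log_policy(obs_other, intent))
        # Gradient of log pi[action_other] w.r.t. the logits: onehot - probs.
        grad_logits = -probs
        grad_logits[action_other] += 1.0
        # Chain rule back to the intent input (policy weights stay fixed).
        intent = intent + lr * (W_INT.T @ grad_logits)
    return intent

def act(obs_self, intent_estimate):
    """Greedy action of the agent, conditioned on the estimated intent."""
    return int(np.argmax(log_policy(obs_self, intent_estimate)))

# Usage: estimate the other agent's intent from one observed action,
# then feed the estimate back into the same policy to choose an action.
obs_other = np.array([0.2, -0.1, 0.5, 1.0])
z_hat = infer_intent(obs_other, action_other=2)
my_action = act(np.array([0.0, 0.3, -0.4, 0.8]), z_hat)
```

With a deeper policy network, the same gradient with respect to the intention input would typically be obtained by automatic differentiation rather than written by hand. Because the sketch reuses the agent's own policy as the model of the other agent, it only makes sense under the cooperative setting the abstract assumes.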
