Multi-Agent Reinforcement Learning Based on Estimating Others' Intentions Using the Self-Policy

不破 雅泰; 増山 岳人

doi:10.1299/jsmermd.2022.2A2-B10

Abstract

The complexity of Multi-Agent Reinforcement Learning (MARL) problems increases exponentially with the number of agents. This poor scalability with respect to the number of agents limits the application of MARL to large-scale multi-agent systems.

In this paper, we present a novel MARL algorithm that leverages a self-policy network to estimate the intentions of other agents. The intention of another agent is obtained by backpropagation through the self-policy network, given that agent's observed action. The estimated intentions are then used as input to the self-policy network. As long as the agents are cooperative, our method does not require any additional model to learn others' intentions. We also introduce a simple curriculum learning scheme that gradually increases the number of agents. Simulation results indicate that the proposed method improves the performance of the learned policy even as the number of agents increases.
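The intention-estimation step can be illustrated with a minimal sketch. The abstract does not specify the network architecture, the objective, or the optimizer, so everything below is an assumption made for illustration: a toy linear-softmax policy stands in for the self-policy network, the intention is a small real-valued vector `z` fed to that policy as input, and `z` is estimated by gradient ascent on the log-probability of the other agent's observed action while the policy weights stay fixed. The names `policy`, `estimate_intention`, `W_OBS`, and `W_INT` are hypothetical and do not come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy(obs, intention, w_obs, w_int):
    """Toy stand-in for a self-policy network: pi(a | obs, intention)."""
    logits = [
        sum(w * o for w, o in zip(w_obs[a], obs))
        + sum(w * z for w, z in zip(w_int[a], intention))
        for a in range(len(w_obs))
    ]
    return softmax(logits)


def estimate_intention(obs, observed_action, w_obs, w_int, steps=50, lr=0.5):
    """Gradient ascent on log pi(observed_action | obs, z) with respect to z only.

    The policy weights are held fixed; only the intention input z is updated.
    """
    dim = len(w_int[0])
    z = [0.0] * dim
    for _ in range(steps):
        probs = policy(obs, z, w_obs, w_int)
        # d log pi(a) / d logit_k = 1[k == a] - probs[k]
        err = [(1.0 if k == observed_action else 0.0) - p
               for k, p in enumerate(probs)]
        # Chain rule through the intention weights gives the gradient w.r.t. z.
        grad = [sum(err[k] * w_int[k][j] for k in range(len(probs)))
                for j in range(dim)]
        z = [zj + lr * g for zj, g in zip(z, grad)]
    return z


# Hypothetical weights: 3 actions, 2-dim observation, 2-dim intention.
W_OBS = [[0.2, -0.1], [0.0, 0.3], [-0.2, 0.1]]
W_INT = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

other_obs = [0.5, -0.5]  # assumed observation of the other agent
z_hat = estimate_intention(other_obs, observed_action=1,
                           w_obs=W_OBS, w_int=W_INT)

p_before = policy(other_obs, [0.0, 0.0], W_OBS, W_INT)[1]
p_after = policy(other_obs, z_hat, W_OBS, W_INT)[1]
print(p_before, p_after)
```

In this sketch the estimated `z_hat` makes the observed action more probable under the agent's own policy than the zero intention does, which is the sense in which no separate model of the other agent is needed. With a real neural policy, the hand-written gradient would be replaced by automatic differentiation with respect to the intention input.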
