A Punishment Shaping Method for Multi-Agent Reinforcement Learning with Individual Risks

青谷 拓海; 小林 泰介; 小澤 隆太

doi:10.1299/jsmermd.2023.1P1-F22

Abstract

Multi-Agent Reinforcement Learning (MARL) is a framework that uses reinforcement learning to learn policies simultaneously for multiple agents, such as robots, operating in the same environment. One concern with reinforcement learning is that stochastic behavior during learning can expose the agent to risk. In MARL, risks such as collisions between agents must be avoided appropriately. This study aims to achieve mutual risk avoidance by evaluating the overall risk of a multi-agent system (MAS) in which each agent has its own primitive risk. The focus is on how rewards and penalties differ in nature within a MAS. The proposed method is designed around risk evaluation using the maximum punishment. Simulation results demonstrate that the proposed method achieves more advanced risk avoidance than risk evaluation based on the mean punishment over all agents.
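The contrast between the two aggregation rules can be illustrated with a minimal sketch. It assumes each agent reports a non-negative scalar punishment, where a larger value means higher risk. The function names and sample values are hypothetical, and the abstract does not give the actual shaping formulation, so this is not the authors' implementation.

```python
from statistics import fmean


def mean_punishment(punishments):
    """Average the per-agent punishments (the baseline named in the abstract)."""
    return fmean(punishments)


def max_punishment(punishments):
    """Take the largest per-agent punishment (the aggregation the abstract
    associates with the proposed method)."""
    return max(punishments)


# Hypothetical punishments for four agents at one time step.
# Assumption: non-negative values, larger means riskier.
punishments = [0.0, 0.0, 0.0, 1.0]

mean_risk = mean_punishment(punishments)
max_risk = max_punishment(punishments)

print(f"mean-based risk: {mean_risk}")
print(f"max-based risk:  {max_risk}")
```

With these sample values, averaging dilutes the single risky agent's punishment across the team, whereas the maximum keeps it at full strength for every agent that shares the evaluation.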
