Host: The Japanese Society for Artificial Intelligence
Name: The 39th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 39
Location : [in Japanese]
Date: May 27, 2025 - May 30, 2025
Recent advances in reinforcement learning for multi-agent environments have underscored the importance of opponent modeling, in which agents infer the internal states or strategies of their opponents. In particular, recent studies have explored autoencoder-based latent representations for opponent modeling in partially observable environments, where access to opponent information is limited during execution. In reinforcement learning, the state input to the policy and value function in a Markov decision process (MDP) must satisfy the Markov property and serve as a sufficient statistic for predicting future rewards. However, under partial observability, many opponent-modeling approaches focus solely on reconstructing opponent information in the latent representation, without ensuring that the representation retains Markovian or reward-predictive properties. To overcome this limitation, we propose a representation learning method that models not only the opponent but also the agent itself. We validate our method through experiments, demonstrating its effectiveness in improving reinforcement learning performance.
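The autoencoder-based opponent modeling described above can be illustrated with a minimal sketch: an encoder maps the agent's own observations to a latent representation, and a decoder is trained to reconstruct the opponent's hidden state from that latent, so that no opponent information is needed at execution time. The linear encoder/decoder, the synthetic data-generating process, and all dimensions below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

obs_dim, opp_dim, latent_dim = 8, 4, 3
n_samples = 256

# Synthetic setup (assumption): the agent's observation is a noisy
# linear function of the opponent's hidden state.
A = rng.normal(size=(opp_dim, obs_dim))
opp_state = rng.normal(size=(n_samples, opp_dim))
obs = opp_state @ A + 0.05 * rng.normal(size=(n_samples, obs_dim))

# Linear encoder/decoder weights, kept deliberately simple.
W_enc = 0.1 * rng.normal(size=(obs_dim, latent_dim))
W_dec = 0.1 * rng.normal(size=(latent_dim, opp_dim))

def recon_loss(W_enc, W_dec):
    z = obs @ W_enc          # latent representation from own observation
    recon = z @ W_dec        # reconstructed opponent state
    return np.mean((recon - opp_state) ** 2)

lr = 0.01
initial = recon_loss(W_enc, W_dec)
for _ in range(500):
    z = obs @ W_enc
    err = 2.0 * (z @ W_dec - opp_state) / n_samples  # dL/d(recon)
    grad_dec = z.T @ err                             # dL/dW_dec
    grad_enc = obs.T @ (err @ W_dec.T)               # dL/dW_enc
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

final = recon_loss(W_enc, W_dec)
print(initial, final)
```

At execution time only `obs @ W_enc` is computed, so the policy can condition on the latent without ever observing the opponent directly; the paper's point is that training the latent purely for this reconstruction objective does not by itself guarantee Markovian or reward-predictive properties.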