Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
38th (2024)
Session ID : 4D3-GS-2-02

Policy Iteration for Stationary Stackelberg Equilibria in General-sum Stochastic Games
Proposal of Pareto-optimal Policies in terms of Stackelberg Equilibria and Probable Convergence Guarantee of the Iterative Method by Policy Improvements
*Mikoto KUDO, Yohei AKIMOTO
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

A stochastic game is a game model in which agents simultaneously maximize their cumulative rewards. A Stackelberg equilibrium is a pair of policies that maximizes the leader agent's return under the condition that the follower agent's policy is always a best response to the leader's policy. Stationary Stackelberg equilibria (SSE) do not always exist, and existing methods require strong assumptions to guarantee convergence and to guarantee that the limit coincides with an SSE. We propose an alternative solution concept, Pareto-optimal (PO) policies, and a policy-iteration-based algorithm for finding them. Our method monotonically approaches the Pareto front through iterative local policy improvements.
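
To make the leader/follower definition concrete, the following is a minimal sketch for the simplest possible case: a single-state, one-shot matrix game in which the leader commits to a pure action. It is not the algorithm proposed in the paper. The function name `pure_stackelberg`, the payoff matrices, and the tie-breaking rule (ties among follower best responses are resolved in the leader's favor) are all assumptions made for illustration.

```python
import numpy as np


def pure_stackelberg(leader_payoff, follower_payoff):
    """Illustrative only: Stackelberg solution of a one-shot matrix game
    restricted to pure leader commitments.

    Rows index leader actions, columns index follower actions.
    Returns (leader_action, follower_action, leader_value).
    """
    leader_payoff = np.asarray(leader_payoff, dtype=float)
    follower_payoff = np.asarray(follower_payoff, dtype=float)

    best = None
    for a in range(leader_payoff.shape[0]):
        # Follower's best responses to the leader's committed action a.
        responses = np.flatnonzero(follower_payoff[a] == follower_payoff[a].max())
        # Assumption: ties are broken in favor of the leader.
        b = responses[np.argmax(leader_payoff[a, responses])]
        value = leader_payoff[a, b]
        # Leader keeps the commitment with the highest resulting payoff.
        if best is None or value > best[2]:
            best = (a, int(b), float(value))
    return best


# Hypothetical payoffs: leader action 0 is dominant in simultaneous play,
# yet committing to action 1 gives the leader a higher payoff.
leader = [[2, 4], [1, 3]]
follower = [[1, 0], [0, 1]]
result = pure_stackelberg(leader, follower)
print(result)
```

In this hypothetical game the follower answers leader action 0 with column 0 and leader action 1 with column 1, so the leader earns 2 or 3 respectively and commits to action 1. The stochastic-game setting in the abstract replaces single actions with stationary policies and one-shot payoffs with cumulative rewards, which is where existence of an SSE can fail.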

© 2024 The Japanese Society for Artificial Intelligence