Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
36th (2022)
Session ID : 2O6-GS-5-02

Learning Algorithm Using Replicator Mutator-Dynamics in Two-Player Zero-Sum Games
Mitsuki SAKAMOTO*, Kentaro TOYOSHIMA, Kenshi ABE, Atsushi IWASAKI
Keywords: Agent, Machine Learning
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

In this study, we consider a variant of the Follow the Regularized Leader (FTRL) dynamics in two-player zero-sum games. FTRL is guaranteed to converge to a Nash equilibrium when its strategies are time-averaged, but many variants exhibit limit cycling, i.e., they lack a last-iterate convergence guarantee. To resolve this issue, we propose mutation-driven FTRL (M-FTRL), an algorithm that introduces mutation to perturb action probabilities. We then investigate the continuous-time dynamics of M-FTRL and provide strong convergence guarantees toward stationary points that approximate Nash equilibria under full-information feedback. Furthermore, our simulations demonstrate that M-FTRL can converge faster than FTRL and optimistic FTRL under full-information feedback and, surprisingly, exhibits clear convergence under bandit feedback.
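
As a rough illustration of the mutation idea, the following sketch simulates an Euler-discretized replicator-mutator dynamic on Rock-Paper-Scissors. The abstract does not state the update rule, so everything here is an assumption rather than the authors' algorithm: the textbook replicator-mutator form dx_i/dt = x_i (q_i - x·q) + mu (c_i - x_i), a uniform mutation target c, the payoff matrix, the starting strategies, and the names `rmd_step` and `simulate` are all hypothetical choices made for this example.

```python
# Hypothetical sketch (not the authors' code): replicator-mutator dynamics
# in a two-player zero-sum game, integrated with a simple Euler step.

# Rock-Paper-Scissors payoffs for player 1; player 2 receives the negation.
RPS = [[0.0, -1.0, 1.0],
       [1.0, 0.0, -1.0],
       [-1.0, 1.0, 0.0]]


def rmd_step(x, q, mu, dt):
    """One Euler step of dx_i/dt = x_i (q_i - x.q) + mu (c_i - x_i)."""
    avg = sum(xi * qi for xi, qi in zip(x, q))
    c = 1.0 / len(x)  # assumption: mutation pulls toward the uniform strategy
    new = [xi + dt * (xi * (qi - avg) + mu * (c - xi)) for xi, qi in zip(x, q)]
    total = sum(new)  # renormalize to guard against floating-point drift
    return [v / total for v in new]


def simulate(mu, steps=20000, dt=0.01):
    """Return the final max deviation of both strategies from uniform."""
    x = [0.6, 0.3, 0.1]  # arbitrary interior starting points (assumption)
    y = [0.2, 0.5, 0.3]
    n = len(x)
    for _ in range(steps):
        # Full-information feedback: each player sees its exact payoff vector.
        q1 = [sum(RPS[i][j] * y[j] for j in range(n)) for i in range(n)]
        q2 = [-sum(RPS[i][j] * x[i] for i in range(n)) for j in range(n)]
        x, y = rmd_step(x, q1, mu, dt), rmd_step(y, q2, mu, dt)
    uniform = 1.0 / n  # the unique Nash equilibrium of Rock-Paper-Scissors
    return max(abs(v - uniform) for v in x + y)
```

Calling `simulate(mu=0.0)` runs the plain replicator dynamic, which keeps cycling around the equilibrium, while a positive rate such as `simulate(mu=0.1)` lets the last iterate settle. In this symmetric game the uniform mutation target coincides with the Nash equilibrium; in general the stationary point would only approximate it, as the abstract notes.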

© 2022 The Japanese Society for Artificial Intelligence