IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Online ISSN : 1745-1337
Print ISSN : 0916-8508
Regular Section
Learning in Two-Player Matrix Games by Policy Gradient Lagging Anchor
Shiyao DING, Toshimitsu USHIO

2019, Volume E102.A, Issue 4, pp. 708-711

Abstract

It is known that the policy gradient algorithm cannot guarantee convergence to a Nash equilibrium in mixed policies when applied to matrix games. To overcome this problem, we propose a novel multi-agent reinforcement learning (MARL) algorithm called the policy gradient lagging anchor (PGLA) algorithm. We prove that, in two-player two-action matrix games, the agents' policies converge to a Nash equilibrium in mixed policies under the PGLA algorithm. By simulation, we confirm the convergence and also show that the PGLA algorithm achieves better convergence than the LR-I lagging anchor algorithm.
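The letter itself gives the precise update rule and the convergence proof; as a rough illustration of the idea only, the Python sketch below combines a policy gradient step with a lagging-anchor drag term in matching pennies, a two-player two-action zero-sum matrix game whose unique Nash equilibrium is the mixed policy (0.5, 0.5) for both players. The parameterization (each policy as the probability p or q of the first action), the step size alpha, and the anchor coefficient eta are assumptions chosen for the sketch, not the authors' settings.

import numpy as np

# Matching pennies: the unique Nash equilibrium is the mixed
# policy (0.5, 0.5) for both players.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])    # row player's payoff matrix
B = -A                         # column player's payoff (zero-sum)

def payoff_grad(opp_prob, M):
    # Derivative of (p, 1-p) M (q, 1-q)^T with respect to the
    # player's own probability, at the opponent's current policy.
    return np.array([1.0, -1.0]) @ M @ np.array([opp_prob, 1.0 - opp_prob])

alpha = 0.01    # step size (assumed, not from the paper)
eta = 0.5       # anchor-drag coefficient (assumed, not from the paper)

p, q = 0.7, 0.3         # initial probabilities of the first action
p_bar, q_bar = p, q     # lagging anchors

for _ in range(50000):
    dp = payoff_grad(q, A)       # row player's payoff gradient
    dq = payoff_grad(p, B.T)     # column player's payoff gradient
    # Gradient ascent plus a pull toward the lagging anchor.
    p = np.clip(p + alpha * (dp + eta * (p_bar - p)), 0.0, 1.0)
    q = np.clip(q + alpha * (dq + eta * (q_bar - q)), 0.0, 1.0)
    # Anchors slowly track the current policies.
    p_bar += alpha * eta * (p - p_bar)
    q_bar += alpha * eta * (q - q_bar)

print(f"p = {p:.3f}, q = {q:.3f}")  # both approach 0.5

Without the anchor terms, the coupled gradient dynamics orbit the mixed equilibrium indefinitely; the drag toward the slowly tracking anchors damps the orbit, so both policies spiral in to (0.5, 0.5).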

© 2019 The Institute of Electronics, Information and Communication Engineers