Host: The Japanese Society for Artificial Intelligence
Name: 34th Annual Conference, 2020
Number: 34
Location: Online
Date: June 09, 2020 - June 12, 2020
The Iterated Prisoner's Dilemma (IPD) has been a standard tool for studying social dilemmas. Because the classic game-theoretic analyses of the IPD end in mutual defection, another class of IPDs, played between reinforcement learners, has been explored. However, the basic nature of this class of games is not yet well understood. In the present paper, we analyzed the Nash equilibria of the IPD between reinforcement learners. In the standard IPD, the only Nash equilibrium resulting from rational choices is known to be the worst outcome for both players. Unlike both previous lines of research, however, our analysis showed that in the IPD with reinforcement learners, the individually rational choices coincide with the mutually beneficial outcome for both players. This result suggests that the social dilemma is dissolved between learning agents of this type.
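As a minimal sketch of the classical baseline mentioned above, the following Python snippet enumerates the pure-strategy Nash equilibria of a single Prisoner's Dilemma stage game. The payoff values (T=5, R=3, P=1, S=0) and the helper name `pure_nash_equilibria` are illustrative assumptions, not taken from the paper, and the snippet covers only the one-shot game, not the reinforcement-learner setting analyzed in the paper.

```python
from itertools import product

# Assumed, conventional payoff values (not from the paper): T > R > P > S.
T, R, P, S = 5, 3, 1, 0
ACTIONS = ("C", "D")  # C = cooperate, D = defect

# PAYOFF[(row_action, col_action)] = (row player's payoff, column player's payoff)
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def pure_nash_equilibria(payoff, actions):
    """Return action pairs where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        # Row player cannot improve by switching while the column player keeps b.
        row_best = all(payoff[(a, b)][0] >= payoff[(alt, b)][0] for alt in actions)
        # Column player cannot improve by switching while the row player keeps a.
        col_best = all(payoff[(a, b)][1] >= payoff[(a, alt)][1] for alt in actions)
        if row_best and col_best:
            equilibria.append((a, b))
    return equilibria


equilibria = pure_nash_equilibria(PAYOFF, ACTIONS)
print(equilibria)  # [('D', 'D')]
```

Under these assumed payoffs, mutual defection is the only pure equilibrium even though mutual cooperation pays each player more (R > P), which is the dilemma the abstract refers to.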