Abstract
Multi-agent systems appear in a wide variety of fields, such as engineering and economics. Many studies have addressed multi-agent reinforcement learning, in which each autonomous agent acquires its own actions through reinforcement learning. Dilemma problems are a typical class of multi-agent problems. In these problems, the best action for each individual agent differs from the best action for the group of agents, which makes them difficult to solve. A typical example is the N-person Iterated Prisoner's Dilemma (NIPD). Several learning methods have been proposed for dilemma problems, especially for NIPD. In this paper, we consider a class of dilemma problems and propose a reinforcement learning method that can learn cooperative actions in dilemma situations. Furthermore, we apply the proposed method to NIPD and the Tragedy of the Commons and investigate its performance. Numerical experiments show that the proposed method learns cooperative actions and outperforms existing methods.