IEEJ Transactions on Electronics, Information and Systems
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
<Softcomputing, Learning>
Theoretical Learning Goal Selection for Non-Communicative Multi-Agent Cooperation
Fumito Uwano, Keiki Takadama

2020 Volume 140 Issue 1 Pages 75-84

Abstract

This paper extends PMRL, a non-communicative, theoretically grounded method for two agents, and proposes PLA, a method that forces agents to learn cooperative behavior for any number of agents. In addition, the paper gives a theoretical explanation of why, under PLA, all agents achieve all purposes without spending the largest amount of time. Concretely, PLA limits the purposes each agent can achieve, so that each agent avoids the more difficult purposes that take a long time to reach, and it forces the agents to learn a cooperative policy by achieving the appropriate purpose among the limited ones. The experimental results show that (1) PLA enables the agents to learn a cooperative policy in two grid world problems with three and five agents, and (2) PLA can force all agents to achieve all purposes in these problems in the minimum time.
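
The target property described in the abstract (every purpose is achieved, and the slowest agent finishes as early as possible) can be illustrated with a small sketch. This is not the authors' PLA, which is a learning method without communication; it is a centralized brute-force search that only shows what "all purposes in the minimum time" means. The grid positions, the use of Manhattan distance as the time to reach a goal, the one-agent-per-goal rule, and the function names are all assumptions made for illustration.

```python
from itertools import permutations


def manhattan(a, b):
    """Steps between two grid cells, assuming no obstacles (an assumption)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def min_makespan_assignment(agents, goals):
    """Try every one-to-one agent-to-goal assignment and keep the one
    whose slowest agent finishes earliest (hypothetical helper)."""
    best_perm, best_time = None, None
    for perm in permutations(range(len(goals)), len(agents)):
        # Time for this assignment is set by the agent with the longest trip.
        t = max(manhattan(agents[i], goals[g]) for i, g in enumerate(perm))
        if best_time is None or t < best_time:
            best_perm, best_time = perm, t
    return best_perm, best_time


# Hypothetical three-agent grid world.
agents = [(0, 0), (0, 4), (4, 0)]
goals = [(4, 4), (0, 1), (3, 0)]

assignment, steps = min_makespan_assignment(agents, goals)
for agent_index, goal_index in enumerate(assignment):
    print(f"agent {agent_index} -> goal {goal_index}")
print("steps needed by the slowest agent:", steps)
```

In this hypothetical layout, agents 0 and 1 are both closest to goal 1, so purely greedy agents would contend for it and leave the far goal at (4, 4) unclaimed. The search instead sends agent 1 to the far goal, which corresponds to the kind of outcome the abstract attributes to limiting each agent's reachable purposes.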

© 2020 by the Institute of Electrical Engineers of Japan