IEEJ Transactions on Electronics, Information and Systems
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
<Softcomputing, Learning>
Proposal of a Propagation Algorithm of the Expected Failure Probability and the Effectiveness on Multi-agent Environments
Hiroki Muraoka, Kazuteru Miyazaki, Hiroaki Kobayashi

2016 Volume 136 Issue 3 Pages 273-281

Abstract

The Improved Penalty Avoiding Rational Policy Making algorithm (IPARP) is known to learn policies from rewards and penalties. IPARP aims to identify penalty rules, that is, rules that are likely to incur a penalty. Although IPARP is effective in many cases, it requires many trial-and-error searches because of memory constraints. In this paper, we propose a method called the Expected Failure Probability Algorithm (EFPA) to speed up learning. In addition, we extend EFPA to multi-agent environments. In multi-agent learning, it is important to avoid the concurrent learning problem, which occurs when multiple agents learn simultaneously. We also propose a method to avoid this problem and confirm its effectiveness through numerical experiments.

© 2016 by the Institute of Electrical Engineers of Japan