Introducing Communication into Plan Construction for Multiagent Partially Observable Markov Decision Processes

田崎 誠; 籔 悠一; 横尾 真; Pradeep VARAKANTHAM; Janusz MARECKI; Milind TAMBE

doi:10.11309/jssst.25.4_226

Introducing Communication to Joint Policy Search Algorithm for Networked Distributed POMDPs

Makoto TASAKI, Yuichi YABU, Makoto YOKOO, Pradeep VARAKANTHAM, Janusz MARECKI, Milind TAMBE



2008 Volume 25 Issue 4 Pages 4_226-4_237



Abstract

The multiagent partially observable Markov decision process (multiagent POMDP) is a popular model for multi-agent systems acting in uncertain domains. An existing algorithm, SPIDER (Search for Policies In Distributed EnviRonments), is guaranteed to find an optimal joint policy by exploiting the structure of agent interactions. With SPIDER, an optimal joint policy can be obtained for large-scale problems when the interaction among agents is sparse. However, the size of a local policy is still too large to obtain a policy whose length exceeds 4. To overcome this problem, we extend SPIDER so that agents can communicate their observation and action histories to each other. After communicating, the agents start from a new synchronized belief state, so the combinatorial explosion of local policies is avoided. Our experimental results show that much longer policies can be obtained as long as the interval between communications is small.
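
The growth that limits policy length can be illustrated with a small counting sketch. It assumes a deterministic local policy is a tree that maps every observation history to an action, which is a standard representation and not necessarily the exact one used in SPIDER. The function names and the action and observation counts are hypothetical and chosen only for illustration.

```python
def policy_tree_nodes(num_observations: int, horizon: int) -> int:
    """Count decision nodes in a policy tree: one node per observation
    history of length 0, 1, ..., horizon - 1."""
    return sum(num_observations ** t for t in range(horizon))


def num_local_policies(num_actions: int, num_observations: int, horizon: int) -> int:
    """Count deterministic local policies: each decision node independently
    selects one of the available actions."""
    return num_actions ** policy_tree_nodes(num_observations, horizon)


if __name__ == "__main__":
    # Illustrative sizes only (assumed, not taken from the paper).
    actions, observations = 3, 2
    for horizon in (2, 3, 4, 5):
        count = num_local_policies(actions, observations, horizon)
        print(f"horizon {horizon}: {count} local policies")
```

Under the same assumption, if agents synchronize every k steps, each policy tree rooted at a synchronized belief state only needs horizon k, so its search space is num_local_policies(actions, observations, k) regardless of the total plan length. This sketch does not account for how many synchronized belief states must be planned for.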

