Transactions of the Institute of Systems, Control and Information Engineers (システム制御情報学会論文誌)
Online ISSN : 2185-811X
Print ISSN : 1342-5668
ISSN-L : 1342-5668
Paper
A Study of Cooperative Thompson Sampling Based on Consensus Control
神村 素輝, 林 直樹, 高井 重昌
Journal, free access

2020, Vol. 33, No. 2, pp. 57-65

Abstract

Distributed control of multi-agent systems has recently attracted much attention. In such systems, each agent makes decisions through interaction with other agents over a communication network. In general, there is a trade-off between exploring for the best choice and exploiting the knowledge already obtained. This trade-off can be formulated as a bandit problem. In this paper, we investigate a distributed bandit problem in which a group of agents cooperatively searches for the best choice in a distributed manner. We propose a cooperative Thompson sampling method based on the consensus algorithm for multi-agent systems. We analyze a regret bound theoretically for the case where the communication network is represented by a complete graph. Numerical examples show that the proposed cooperative Thompson sampling reduces the regret compared with the case where agents search for the best choice individually, without cooperation.
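
The abstract does not state the update rules, so the following is only a minimal sketch of how Thompson sampling might be combined with a consensus step. It assumes Bernoulli rewards, Beta(1, 1) priors, and an averaging consensus over a row-stochastic weight matrix. The function name and the parameters `weights` and `count_scale` are illustrative assumptions, not the authors' notation or algorithm.

```python
import numpy as np


def cooperative_thompson_sampling(true_means, weights, horizon,
                                  count_scale=1.0, seed=0):
    """Hypothetical sketch: Beta-Bernoulli Thompson sampling with consensus.

    true_means  : success probability of each arm (Bernoulli rewards assumed)
    weights     : (n_agents, n_agents) row-stochastic consensus matrix (assumed)
    count_scale : factor applied to the averaged counts before sampling
                  (an assumption of this sketch; with uniform weights on a
                  complete graph, n_agents recovers the network-wide totals)
    Returns the cumulative pseudo-regret summed over all agents.
    """
    rng = np.random.default_rng(seed)
    true_means = np.asarray(true_means, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n_agents, n_arms = weights.shape[0], true_means.size

    # Per-agent, per-arm estimates of observed successes and failures.
    successes = np.zeros((n_agents, n_arms))
    failures = np.zeros((n_agents, n_arms))
    agent_ids = np.arange(n_agents)
    best_mean = true_means.max()
    regret = 0.0

    for _ in range(horizon):
        # Thompson step: sample each arm's mean from its Beta posterior
        # and let every agent play the arm with the largest sample.
        samples = rng.beta(1.0 + count_scale * successes,
                           1.0 + count_scale * failures)
        arms = samples.argmax(axis=1)
        rewards = rng.random(n_agents) < true_means[arms]

        # Local update with each agent's own observation.
        successes[agent_ids, arms] += rewards
        failures[agent_ids, arms] += ~rewards

        # Consensus step: mix the estimates with those of the neighbors.
        successes = weights @ successes
        failures = weights @ failures

        regret += float(np.sum(best_mean - true_means[arms]))
    return regret


if __name__ == "__main__":
    n = 4
    means = [0.2, 0.5, 0.8]
    complete = np.full((n, n), 1.0 / n)   # complete graph, uniform weights
    isolated = np.eye(n)                  # no communication
    print("cooperative:", cooperative_thompson_sampling(means, complete, 500, count_scale=n))
    print("individual: ", cooperative_thompson_sampling(means, isolated, 500))
```

Passing the identity matrix as `weights` with `count_scale=1.0` reduces the sketch to agents running Thompson sampling independently, which gives a non-cooperative baseline for comparison in this toy setting.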

© 2020 The Institute of Systems, Control and Information Engineers