Transactions of the Institute of Systems, Control and Information Engineers
Online ISSN : 2185-811X
Print ISSN : 1342-5668
ISSN-L : 1342-5668
Analysis of Distributed Thompson Sampling based on Consensus Control
Motoki Kamimura, Naoki Hayashi, Shigemasa Takai
JOURNAL FREE ACCESS

2020 Volume 33 Issue 2 Pages 57-65

Abstract

Distributed control of multi-agent systems has recently attracted much attention. In such systems, each agent makes decisions through interaction with other agents over a communication network. In general, there is a trade-off between exploring to find the best choice and exploiting the knowledge already obtained, and this trade-off can be formulated as a bandit problem. In this paper, we investigate a distributed bandit problem in which a group of agents cooperatively searches for the best choice. We propose a cooperative Thompson sampling method based on the consensus algorithm for multi-agent systems. A regret bound is analyzed theoretically for the case in which the communication network is represented by a complete graph. Numerical examples show that the proposed cooperative Thompson sampling reduces the regret compared with the case in which agents search for the best choice individually, without cooperation.
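
The idea of combining Thompson sampling with a consensus step can be illustrated with a minimal sketch. It is not the algorithm analyzed in the paper; it rests on several assumptions that the abstract does not state: Bernoulli rewards, Beta(1, 1) priors, a single uniform averaging step per round (which is exact on a complete graph), and a posterior built from the network average of the agents' local counts scaled by the number of agents. The function name simulate_regret and all of its parameters are hypothetical.

```python
import random


def simulate_regret(num_agents=4, arm_means=(0.2, 0.5, 0.8),
                    horizon=500, cooperate=True, seed=0):
    """Return the cumulative expected regret summed over all agents.

    Illustrative sketch only (assumed setting: Bernoulli arms, Beta priors,
    complete communication graph). Not the method from the paper.
    """
    rng = random.Random(seed)
    num_arms = len(arm_means)
    best_mean = max(arm_means)

    # Local counts: wins[i][k] / losses[i][k] hold agent i's own observations of arm k.
    wins = [[0.0] * num_arms for _ in range(num_agents)]
    losses = [[0.0] * num_arms for _ in range(num_agents)]
    # Counts each agent uses to form its posterior (start with no information).
    est_wins = [[0.0] * num_arms for _ in range(num_agents)]
    est_losses = [[0.0] * num_arms for _ in range(num_agents)]

    regret = 0.0
    for _ in range(horizon):
        for i in range(num_agents):
            # Thompson sampling: draw one sample per arm from the Beta posterior.
            samples = [rng.betavariate(1.0 + est_wins[i][k], 1.0 + est_losses[i][k])
                       for k in range(num_arms)]
            arm = max(range(num_arms), key=samples.__getitem__)
            reward = 1 if rng.random() < arm_means[arm] else 0
            wins[i][arm] += reward
            losses[i][arm] += 1 - reward
            regret += best_mean - arm_means[arm]

        for k in range(num_arms):
            if cooperate:
                # Consensus step (assumed): on a complete graph, one uniform
                # averaging step gives every agent the exact network average.
                avg_w = sum(wins[j][k] for j in range(num_agents)) / num_agents
                avg_l = sum(losses[j][k] for j in range(num_agents)) / num_agents
                for i in range(num_agents):
                    # Scale the average by the number of agents to recover pooled counts.
                    est_wins[i][k] = num_agents * avg_w
                    est_losses[i][k] = num_agents * avg_l
            else:
                # No cooperation: each agent relies on its own observations only.
                for i in range(num_agents):
                    est_wins[i][k] = wins[i][k]
                    est_losses[i][k] = losses[i][k]
    return regret


if __name__ == "__main__":
    print("cooperative :", simulate_regret(cooperate=True))
    print("independent :", simulate_regret(cooperate=False))
```

Running the script prints the cumulative regret for the cooperative and the independent setting under the same seed, which allows a rough comparison in the spirit of the numerical examples mentioned above. A sparser communication graph would need repeated weighted averaging with neighbors rather than the single exact averaging step assumed here.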

© 2020 The Institute of Systems, Control and Information Engineers