Transactions of the Japanese Society for Artificial Intelligence
Online ISSN : 1346-8030
Print ISSN : 1346-0714
ISSN-L : 1346-0714
Interactive Restless Multi-armed Bandit Game and Swarm Intelligence Effect
Shunsuke Yoshida, Masato Hisakado, Shintaro Mori
Advance online publication

Article ID: 30-6_JWEIN-B

Abstract

We define the swarm intelligence effect and obtain the condition for its emergence in an interactive restless multi-armed bandit game in which a player competes with multiple agents. Each arm of the bandit has a payoff that changes with probability pc per round. The agents and the player choose one of three options: (1) Exploit (exploiting a good arm), (2) Innovate (asocial exploring for good arms), and (3) Observe (social exploring for good arms). Each agent has two parameters (c, pobs) that specify its decision: (i) c, the threshold value for Exploit; if the agent knows only arms whose payoffs are less than c, it chooses to explore. (ii) pobs, the probability of Observe when the agent explores. The parameters (c, pobs) of the agents are uniformly distributed. We introduce a scope nI for searching good arms in Innovate to control its cost. We determine optimal strategies for a player who uses the complete knowledge about the bandit and the information about the arms exploited by the agents. We show in which regions of the (pc, nI) space social or asocial exploring is optimal. We conduct a laboratory experiment (67 subjects). If (pc, nI) is chosen so that social learning is far superior to asocial learning, we observe the swarm intelligence effect. If (pc, nI) lies in the region where asocial learning is optimal or comparable to social learning, we do not observe the effect.
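The abstract states the decision rule compactly; the following Python sketch illustrates one possible reading of the round structure. The payoff distribution of the arms, what exactly Innovate and Observe reveal, and the assumption that exploring rounds yield no payoff are illustrative choices here, not details taken from the paper.

```python
import random

def simulate(K=30, n_agents=10, rounds=200, p_c=0.05, n_I=3, seed=0):
    """Toy agent-only simulation of the restless bandit.

    Assumptions: arm payoffs are uniform on [0, 1]; Innovate reveals the
    payoffs of n_I randomly chosen arms; Observe reveals one arm exploited
    by some agent in the previous round.
    """
    rng = random.Random(seed)
    payoff = [rng.random() for _ in range(K)]          # current payoff of each arm
    # each agent: threshold c and observation probability p_obs, both uniform on [0, 1]
    agents = [{"c": rng.random(), "p_obs": rng.random(), "known": {}}
              for _ in range(n_agents)]
    last_exploited = []                                # arms exploited in the previous round
    total_reward = 0.0

    for _ in range(rounds):
        exploited_now = []
        for ag in agents:
            good = {a: v for a, v in ag["known"].items() if v >= ag["c"]}
            if good:                                   # Exploit: best known good arm
                arm = max(good, key=good.get)
                total_reward += payoff[arm]
                ag["known"][arm] = payoff[arm]         # refresh knowledge of that arm
                exploited_now.append(arm)
            elif last_exploited and rng.random() < ag["p_obs"]:
                arm = rng.choice(last_exploited)       # Observe: social exploration
                ag["known"][arm] = payoff[arm]
            else:                                      # Innovate: asocial exploration of n_I arms
                for arm in rng.sample(range(K), n_I):
                    ag["known"][arm] = payoff[arm]
        last_exploited = exploited_now

        # restless dynamics: each arm's payoff is redrawn with probability p_c per round
        for a in range(K):
            if rng.random() < p_c:
                payoff[a] = rng.random()

    return total_reward / (n_agents * rounds)

if __name__ == "__main__":
    print(simulate())
```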

© The Japanese Society for Artificial Intelligence 2015