Abstract
We study a model for social-learning agents in a restless multiarmed bandit (rMAB). The
rMAB has one good arm that changes to a bad one with a certain probability. Each agent seeks the good
arm by random search with probability 1-r, or by copying information from other agents (social learning) with
probability r. Each agent’s fitness is the probability of knowing the good arm. In this model, we explicitly
construct the unique Nash equilibrium state and show that the corresponding strategy for each agent is an
evolutionarily stable strategy (ESS) in the sense of Thomas. The ESS Nash equilibrium is a solution to
Rogers’ paradox. We also consider the space of mixed strategies and introduce a natural dynamics that aims
to increase each agent’s fitness. We show that the dynamics converges to the ESS Nash equilibrium.