Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
33rd (2019)
Session ID : 3Rin2-07
Conference information

Multi-armed bandit algorithm applicable to stationary and non-stationary environment using self-organizing maps
*Nobuhito MANOME, Shuji SHINOHARA, Kouta SUZUKI, Kosuke TOMONAGA, Shunji MITSUYOSHI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

A communication robot that aims to satisfy the user facing it needs to select appropriate behavior quickly. However, user requests often change while the robot is still determining the most appropriate behavior, which makes it difficult for the robot to derive that behavior. Such problems can be formulated as a multi-armed bandit problem. To solve this problem, we propose a multi-armed bandit algorithm that uses a self-organizing map to adapt to both stationary and non-stationary environments. In this study, we conducted numerous experiments on a stochastic multi-armed bandit problem in both stationary and non-stationary environments. The results show that, compared with the existing UCB1, UCB1-Tuned, and Thompson Sampling algorithms, the proposed algorithm performs equivalently or better in stationary environments with numerous arms and performs consistently well in non-stationary environments regardless of the number of arms.
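
For readers unfamiliar with the baselines named in the abstract, the following is a minimal sketch of UCB1 on a Bernoulli bandit. It is not the proposed self-organizing-map algorithm, whose details the abstract does not give. The function names (`ucb1_select`, `run_ucb1`), the Bernoulli reward model, and the single abrupt change of reward probabilities at step `change_at` are assumptions made here for illustration only; the experimental setup in the paper may differ.

```python
import math
import random


def ucb1_select(counts, sums, t):
    """Return the arm index chosen by UCB1 at step t (1-based)."""
    # Play every arm once before the confidence bound is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Empirical mean plus the exploration bonus sqrt(2 ln t / n).
    scores = [
        sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_ucb1(probs_before, probs_after, horizon, change_at, seed=0):
    """Run UCB1 on a Bernoulli bandit and return (pull counts, total reward).

    Assumption for illustration: the environment switches abruptly from
    probs_before to probs_after once t exceeds change_at. Setting
    change_at = horizon gives a stationary environment.
    """
    rng = random.Random(seed)
    k = len(probs_before)
    counts = [0] * k   # number of pulls per arm
    sums = [0.0] * k   # cumulative reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        probs = probs_before if t <= change_at else probs_after
        arm = ucb1_select(counts, sums, t)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return counts, total
```

Calling `run_ucb1([0.1, 0.9], [0.1, 0.9], 2000, 2000)` simulates a stationary environment, while `run_ucb1([0.1, 0.9], [0.9, 0.1], 2000, 1000)` swaps the best arm halfway through. Because plain UCB1 averages over all past rewards, it has no mechanism for discounting observations made before such a change, which is the kind of setting the abstract targets.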

© 2019 The Japanese Society for Artificial Intelligence