Abstract
A multi-robot system is composed of multiple, relatively simple robots. Reinforcement learning is a promising approach for controlling each robot in such a system. However, its performance depends heavily on how the state and action spaces are segmented. To address this problem, we have been developing a new technique named BRL. This paper introduces a meta-learning mechanism into standard BRL in order to improve its learning ability. We investigate the performance of the extended BRL through physical experiments. The task is for mobile robots to orbit an object in an environment.