Abstract
Multi-robot systems (MRS) are expected to accomplish tasks that a single robot cannot perform. In MRS, reinforcement learning is one of the promising approaches for controlling each robot. However, its performance depends heavily on how the state and action spaces are segmented. To deal with this problem, we have been developing a technique named BRL, which segments the state and action spaces autonomously. To improve learning performance, this paper introduces a mechanism for selecting between two state spaces: one is a parametric model useful for exploration, and the other is a non-parametric model for exploitation. We evaluate the proposed technique by conducting physical experiments on a cooperative carrying task with three autonomous mobile robots.