2018, Vol. 84, No. 862, p. 17-00288
The field of multi-robot systems (MRSs), which deals with groups of autonomous robots, has recently been attracting considerable research interest in robotics. MRSs are expected to accomplish tasks that are difficult for an individual robot to achieve. In MRSs, reinforcement learning (RL) is one of the most promising approaches for the distributed control of each robot. RL allows the participating robots to learn a mapping from their states to their actions from the rewards or payoffs obtained through interaction with their environment. Theoretically, the environment of an MRS is non-stationary, since the rewards or payoffs that the learning robots receive depend not only on their own actions but also on the actions of the other robots. To cope with this, an RL method named Bayesian-discrimination-function-based Reinforcement Learning (BRL), which simultaneously and autonomously segments the state and action spaces to improve adaptability to dynamic environments, has been proposed. To improve the learning performance of BRL, this paper proposes a technique for selecting between two state-space models: a parametric model suited to exploration and a non-parametric model suited to exploitation. The proposed technique is evaluated through computer simulations of a cooperative carrying task with six autonomous mobile robots.
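The core idea of the proposed technique, switching between a parametric and a non-parametric state model, can be illustrated with a minimal sketch. The class names, the single-Gaussian and kernel-density model choices, and the likelihood threshold below are all assumptions for illustration, not details taken from the paper:

```python
import math

class ParametricModel:
    """Single-Gaussian summary of visited states: coarse and smooth,
    assumed here as the model used during exploration."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        # Welford's online update of the running mean and variance.
        self.n += 1
        d = x - self.mean
        self.mean += d / self.n
        self.m2 += d * (x - self.mean)

    def likelihood(self, x):
        var = self.m2 / self.n if self.n > 1 else 1.0
        return math.exp(-(x - self.mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

class NonParametricModel:
    """Gaussian kernel-density estimate over stored state samples: detailed,
    assumed here as the model used during exploitation."""
    def __init__(self, bandwidth=0.5):
        self.samples, self.h = [], bandwidth

    def update(self, x):
        self.samples.append(x)

    def likelihood(self, x):
        if not self.samples:
            return 0.0
        k = lambda u: math.exp(-u * u / 2) / math.sqrt(2 * math.pi)
        return sum(k((x - s) / self.h) for s in self.samples) / (len(self.samples) * self.h)

def select_model(nonpar, x, threshold=0.05):
    """Hypothetical selection rule: exploit with the non-parametric model
    where it has enough evidence for the current state, otherwise fall back
    to the parametric model and keep exploring."""
    if nonpar.likelihood(x) > threshold:
        return "non-parametric (exploit)"
    return "parametric (explore)"
```

For example, after feeding the non-parametric model several samples near a state region, `select_model` would pick it for states inside that region and fall back to the parametric model for novel states, which is the exploration/exploitation trade-off the abstract describes.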