2008, Vol. 74, No. 747, pp. 2747-2754
This paper proposes an extension of a reinforcement learning method called BRL to improve its robustness. A novel feature of BRL is that the continuous state space and the continuous action space are segmented autonomously and simultaneously during online learning. We have shown elsewhere that BRL is effective not only for single-robot problems but also for multi-robot problems. In BRL, the continuous state space is segmented by the Bayesian discrimination function method, based on the instances perceived in each episode, while the continuous action space is segmented by the same method based on randomly generated actions. Random action generation seems reasonable when a perceived state is clearly different from the states in the acquired rules, but it seems inappropriate when a perceived state is somewhat similar to those states. For the latter case, we therefore propose an extension of BRL in which the action is computed as the weighted linear interpolation of the actions in the similar rules. After formalizing the proposed extension, we demonstrate the navigation problem of an autonomous mobile robot, in computer simulation as well as in physical experiments, to verify the improvement achieved by the proposed method.
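The interpolation idea described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the rule representation (a state center paired with an action), the Gaussian similarity kernel, and the similarity threshold are all assumptions made for the example.

```python
import numpy as np

def interpolate_action(state, rules, sim_threshold=0.5):
    """Return an action for `state` as a weighted linear interpolation
    of the actions of rules whose states are similar to `state`.

    `rules` is a list of (state_center, action) pairs. The Gaussian
    similarity measure and the threshold are illustrative assumptions.
    """
    sims, actions = [], []
    for center, action in rules:
        # Similarity in state space (assumed Gaussian kernel, width 1.0)
        sim = np.exp(-np.sum((np.asarray(state) - np.asarray(center)) ** 2))
        if sim >= sim_threshold:
            sims.append(sim)
            actions.append(np.asarray(action, dtype=float))
    if not sims:
        # No similar rule: BRL would instead fall back to a randomly
        # generated action in this case.
        return None
    w = np.array(sims) / np.sum(sims)  # normalized interpolation weights
    return np.sum(w[:, None] * np.array(actions), axis=0)

# Example: a state midway between two rule states yields the average action.
rules = [((0.0, 0.0), (1.0, 0.0)), ((0.1, 0.0), (0.0, 1.0))]
a = interpolate_action((0.05, 0.0), rules)
```

In this example the perceived state is equidistant from both rule states, so both rules receive weight 0.5 and the interpolated action is the midpoint of the two stored actions.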