Abstract
This study proposes a configuration for a mobile robot system that can acquire a new movement pattern when some of its actuators have failed and been replaced by alternatives whose configuration may differ from that of the originals. The proposed method is based on reinforcement learning, modified to converge rapidly over a wide search space. To this end, a "growth of action-value" method is proposed, which enables effective exploration of the action space based on the temporal reliability of each action value. In a series of 3D simulation experiments, the proposed method converges rapidly to a good candidate movement pattern.