Host: The Japanese Society for Artificial Intelligence
Name : 34th Annual Conference, 2020
Number : 34
Location : Online
Date : June 09, 2020 - June 12, 2020
Robots can break down, and in environments where access is limited they must still accomplish their assigned tasks even when repair is not possible. The purpose of this research is to use reinforcement learning to derive a policy that keeps the robot performing well even after a failure. The proposed method learns the normal (failure-free) transition function and adds the difference between the predicted state transition and the actual state transition to the input of the policy network. Experimental results show that our method outperformed a baseline method that does not use the state transition difference.
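The way the policy input is built can be illustrated with a minimal sketch. This is not the authors' implementation. The names `nominal_model`, `faulty_environment`, `transition_difference`, and `build_policy_input` are hypothetical, the additive dynamics stand in for a learned transition function, the failure is modeled as one joint ignoring its action, and the difference is taken as actual minus predicted (the abstract does not state the sign convention or whether the difference is concatenated to the state).

```python
from typing import List, Sequence

Vector = List[float]


def nominal_model(state: Sequence[float], action: Sequence[float]) -> Vector:
    """Hypothetical stand-in for the learned normal transition function.

    Assumption: each state component simply shifts by the matching action.
    """
    return [s + a for s, a in zip(state, action)]


def faulty_environment(state: Sequence[float], action: Sequence[float]) -> Vector:
    """Hypothetical failure: component 1 no longer responds to its action."""
    next_state = nominal_model(state, action)
    next_state[1] = state[1]
    return next_state


def transition_difference(predicted: Sequence[float], actual: Sequence[float]) -> Vector:
    """Difference between the actual and the predicted next state (assumed sign)."""
    return [a - p for p, a in zip(predicted, actual)]


def build_policy_input(state: Sequence[float], difference: Sequence[float]) -> Vector:
    """Append the transition difference to the state (assumed concatenation)."""
    return list(state) + list(difference)


prev_state = [0.0, 1.0]
action = [0.5, -0.25]

predicted = nominal_model(prev_state, action)    # what a healthy robot would do
actual = faulty_environment(prev_state, action)  # what the broken robot did
diff = transition_difference(predicted, actual)
policy_input = build_policy_input(actual, diff)

print(diff)          # nonzero only for the failed component
print(policy_input)  # current state followed by the difference
```

In this sketch the difference is zero for the healthy component and nonzero for the failed one, so a policy that receives it has a signal indicating which part of the dynamics has changed. A baseline without the difference would receive only the state.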