Abstract
This paper shows how a compact humanoid robot can acquire a giant-swing motion using Q-learning, without any model of the robot. Q-learning is widely regarded as unsuitable for learning dynamic motions because the Markov property is not necessarily guaranteed during a dynamic task. We address this problem by embedding the angular velocity in the state definition and by applying an averaging Q-learning method to reduce dynamic effects, although some non-Markov effects remain in the learning results. The results show that the robot can acquire a giant-swing motion with the Q-learning algorithm. The successfully acquired motions are analyzed from the viewpoint of dynamics in order to realize a functional giant-swing motion. Finally, the results show how this method avoids the stagnant action loop near the bottom of the horizontal bar during the early stage of the giant-swing motion.
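As a rough illustration of the state definition described above, the following is a minimal sketch of tabular Q-learning in which the discrete state combines a swing-angle bin with an angular-velocity bin. It is not the authors' implementation: the bin counts, the velocity limit `MAX_VEL`, the action set `ACTIONS`, and the parameters `alpha`, `gamma`, and `epsilon` are all hypothetical. The abstract does not specify how the averaging is performed, so the update below is only the standard Q-learning rule, which blends the old estimate with the new target.

```python
import math
import random
from collections import defaultdict

# All constants below are hypothetical values chosen for illustration.
N_ANGLE_BINS = 12      # number of bins for the swing angle
N_VEL_BINS = 5         # number of bins for the angular velocity
MAX_VEL = 10.0         # assumed angular-velocity limit in rad/s
ACTIONS = (0, 1, 2)    # assumed indices of discrete posture commands


def discretize(angle, ang_vel):
    """Map a continuous (angle, angular velocity) pair to a discrete state."""
    a = angle % (2 * math.pi)
    angle_bin = min(int(a / (2 * math.pi) * N_ANGLE_BINS), N_ANGLE_BINS - 1)
    v = max(-MAX_VEL, min(MAX_VEL, ang_vel))  # clamp to the assumed range
    vel_bin = min(int((v + MAX_VEL) / (2 * MAX_VEL) * N_VEL_BINS), N_VEL_BINS - 1)
    return (angle_bin, vel_bin)


def make_q_table():
    """Create a Q-table that returns zero-initialized values for unseen states."""
    return defaultdict(lambda: [0.0] * len(ACTIONS))


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update and return the new Q-value."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    values = q[state]
    return max(ACTIONS, key=lambda a: values[a])
```

Including the velocity bin lets the table distinguish passing through the same angle while swinging forward from passing through it while swinging backward, which a purely angle-based state cannot do.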