Abstract
In our laboratory, we have used reinforcement learning to acquire forward-movement actions on various robot systems. We have also acquired a giant swing motion, a dynamic task, by devising suitable rewards. The purpose of this study is to clarify the probabilistic behavior of the giant swing robot. Although the robot's motion is continuous and dynamic, described by quantities such as its angle and angular velocity, the state of the motion must be divided into 216 discrete states in order to apply reinforcement learning. For this reason, the robot exhibits probabilistic behavior. Our analysis made clear that a collapse of the learned knowledge can occur, which we define as the case in which Q-values that once achieved the giant swing motion become unable to achieve it as the learning count increases. The results show that a 100% rotation success rate cannot be obtained even when the learning has converged sufficiently.
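The discretization step mentioned above can be illustrated with a minimal sketch. The abstract gives only the total of 216 states, so the 6 x 6 x 6 split over three quantities, the value ranges, and the names `discretize` and `q_update` are assumptions made for illustration, not the layout used in the study. The learning rule shown is the standard tabular Q-learning update, which may differ from the exact rule used in the study.

```python
import math

# Hypothetical layout: the source states only that there are 216 states.
BINS = (6, 6, 6)  # assumed: 6 bins each for two angles and one angular velocity
RANGES = (
    (-math.pi, math.pi),  # assumed range of the first angle [rad]
    (-math.pi, math.pi),  # assumed range of the second angle [rad]
    (-10.0, 10.0),        # assumed range of the angular velocity [rad/s]
)
N_STATES = BINS[0] * BINS[1] * BINS[2]


def bin_index(value, low, high, n_bins):
    """Map a continuous value to a bin index in [0, n_bins - 1], clamping at the edges."""
    ratio = (value - low) / (high - low)
    return min(n_bins - 1, max(0, int(ratio * n_bins)))


def discretize(obs):
    """Combine the per-quantity bin indices into one state index in [0, N_STATES - 1]."""
    state = 0
    for value, (low, high), n in zip(obs, RANGES, BINS):
        state = state * n + bin_index(value, low, high, n)
    return state


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on a list-of-lists Q table."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
```

Under these assumed ranges, the observations `(0.1, 0.1, 0.5)` and `(0.2, 0.3, 1.0)` fall into the same discrete state although the underlying continuous motions differ. The same table entry is therefore applied to physically different situations, which is one way to picture why the learned behavior appears probabilistic.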