Abstract
To learn behaviors of real robotic systems, we propose a method that extends reinforcement learning to continuous robot control problems. Reinforcement learning methods are applicable to a variety of fields; however, they generally handle only discrete states and actions, and they can require an impractical amount of time to learn practical problems. In the proposed method, CMAC, a connectionist model, is employed for the utility networks in order to represent continuous states and control variables. To fully utilize experiences and accelerate learning, experience sequences are stored during actual actuation and later replayed with priorities. As a testbed, the learning system is applied in simulation to the control of the swing amplitude of a two-link brachiation robot, whose motion is strongly constrained by its dynamics.
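The two mechanisms named above, a CMAC (tile-coding) approximator for continuous states and a priority-ordered replay buffer for stored experiences, can be sketched as follows. This is a minimal illustration only; the class names, tiling layout, learning rate, and priority scheme are assumptions for exposition, not the paper's exact configuration.

```python
import heapq

class CMAC:
    """Minimal tile-coding (CMAC) approximator over a 1-D continuous
    input; several offset tilings share the representation, so nearby
    inputs generalize.  Parameters here are illustrative choices."""

    def __init__(self, n_tilings=8, n_tiles=16, lo=0.0, hi=1.0, alpha=0.1):
        self.n_tilings, self.n_tiles = n_tilings, n_tiles
        self.lo, self.hi = lo, hi
        self.alpha = alpha / n_tilings      # split step size across tilings
        self.weights = [[0.0] * (n_tiles + 1) for _ in range(n_tilings)]

    def _tiles(self, x):
        # Each tiling is shifted by a fraction of one tile width.
        span = self.hi - self.lo
        for t in range(self.n_tilings):
            idx = int((x - self.lo) / span * self.n_tiles + t / self.n_tilings)
            yield t, min(max(idx, 0), self.n_tiles)

    def predict(self, x):
        return sum(self.weights[t][i] for t, i in self._tiles(x))

    def update(self, x, target):
        err = target - self.predict(x)
        for t, i in self._tiles(x):
            self.weights[t][i] += self.alpha * err


class PrioritizedReplay:
    """Store experiences with priorities; replay highest priority first."""

    def __init__(self):
        self._heap = []
        self._count = 0                     # tie-breaker for heapq ordering

    def add(self, transition, priority):
        heapq.heappush(self._heap, (-priority, self._count, transition))
        self._count += 1

    def pop(self):
        return heapq.heappop(self._heap)[2]
```

For example, repeatedly calling `update(0.5, 1.0)` drives `predict(0.5)` toward 1.0, while a buffer loaded with transitions of priorities 1, 5, and 3 replays the priority-5 transition first.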