Abstract
This paper proposes a multi-layered reinforcement learning system that integrates lower-level learning modules and generates higher-level purposive behaviors, enabling an autonomous robot to learn progressively from lower-level to higher-level behaviors throughout its lifetime. We decompose the large state space at the bottom level into several subspaces and merge those subspaces at the higher level. This allows the system to reuse policies it has already learned and to learn new policies over newly introduced features; as a result, the curse of dimensionality is avoided. To show the validity of the proposed method, we apply it to a simple soccer situation in the context of RoboCup and present experimental results.
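The layered architecture summarized above can be sketched in code. The following is a minimal illustration, not the paper's implementation: it assumes tabular Q-learning for each lower module over its own state subspace, and a higher layer that treats the lower modules' greedy outputs as an abstract state. The names `QModule` and `TwoLayerAgent`, and the action labels, are hypothetical.

```python
import random
from collections import defaultdict


class QModule:
    """Tabular Q-learning over one state subspace (illustrative sketch)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9):
        self.q = defaultdict(float)   # (state, action) -> Q value
        self.actions = actions
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor

    def best_action(self, state):
        # Greedy action with respect to the current Q table.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s_next):
        # One-step Q-learning backup within this subspace only.
        target = r + self.gamma * max(self.q[(s_next, b)] for b in self.actions)
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])


class TwoLayerAgent:
    """Higher layer learns over an abstract state built from the lower
    modules' greedy choices, so the full state space is never enumerated."""

    def __init__(self, lower_modules, high_actions):
        self.lower = lower_modules
        self.high = QModule(high_actions)

    def abstract_state(self, sub_states):
        # Each lower module sees only its own subspace; the higher layer
        # merges their greedy outputs into one compact abstract state.
        return tuple(m.best_action(s) for m, s in zip(self.lower, sub_states))


if __name__ == "__main__":
    random.seed(0)
    # Train one lower module on a toy one-step task: "kick" pays off.
    m = QModule(actions=["kick", "move"])
    for _ in range(50):
        m.update(0, "kick", 1.0, 1)
    print(m.best_action(0))  # the trained module prefers "kick"
```

Because each `QModule` learns over its own subspace, a trained module can be reused as-is when new features are introduced at the higher level, which is the mechanism by which the abstract claims the curse of dimensionality is avoided.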