Abstract
In this paper, we propose a new framework that can be applied to a major class of reinforcement learning methods. It enables autonomous robots to acquire behavioral concepts incrementally through on-line interactions with their environments and through rewards. The framework is based on J. Piaget's schema theory, which emphasizes the coexistence of two pairs of processes: assimilation and accommodation, and equilibration and differentiation. The approach aims at realizing a social robot that can acquire many behaviors through interactions with its users and its environment. Our framework can be applied to any TD-learning method. We present the results of two experiments: the first uses Q-learning, and the second uses Dual-Schemata model based reinforcement learning. In both cases, the agents acquire several behavioral concepts without any explicit indication from their supervisors of the differences between those behaviors. Moreover, we show that the rewards given to a learning robot take on a new role: recalling the most suitable schema.