Abstract
In recent years, methods for embedding human-like emotions in robots have been actively developed in the fields of entertainment robots and nursing care robots. In this paper, to realize emotion generation and emotional behavior that are more appropriate for a living being, we embed in a robot a model of the neuromodulators found in the human brain and propose a method for learning emotional behavior using Q-learning with meta-parameter control. Specifically, we propose a target-selection-type Q-learning method that maintains multiple Q-values corresponding to the maximization and minimization of rewards and punishments. We aim to realize a system that acquires complex emotional behaviors, selected according to positive and negative evaluations of the situation. We also report the results of computer simulation experiments and a Kansei evaluation conducted to confirm the effectiveness of the proposed method.
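
To make the idea of multiple Q-values with target selection concrete, the following is a minimal tabular sketch and not the implementation evaluated in this paper. It assumes one Q-table per target, where a target is a pair of a signal (reward or punishment) and a mode (maximize or minimize). The class name `TargetSelectionQ`, the `TARGETS` list, the epsilon-greedy policy, and the fixed values of `alpha`, `gamma`, and `epsilon` are all hypothetical choices made for illustration; in the proposed method the meta-parameters would be controlled by the neuromodulator model rather than held constant.

```python
import random

# Hypothetical target set: one Q-table per (signal, mode) pair.
TARGETS = [("reward", "max"), ("reward", "min"),
           ("punishment", "max"), ("punishment", "min")]


class TargetSelectionQ:
    """Illustrative tabular Q-learning with separate Q-values per target."""

    def __init__(self, n_states, n_actions,
                 alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate (assumed constant here)
        self.gamma = gamma      # discount factor (assumed constant here)
        self.epsilon = epsilon  # exploration rate (assumed constant here)
        self.rng = random.Random(seed)
        self.q = {t: [[0.0] * n_actions for _ in range(n_states)]
                  for t in TARGETS}

    @staticmethod
    def _best(values, mode):
        # "max" targets bootstrap on the largest value, "min" on the smallest.
        return max(values) if mode == "max" else min(values)

    def update(self, s, a, reward, punishment, s_next):
        """Apply one Q-learning update to every table from one transition."""
        signals = {"reward": reward, "punishment": punishment}
        for (signal, mode), table in self.q.items():
            bootstrap = self._best(table[s_next], mode)
            td_error = signals[signal] + self.gamma * bootstrap - table[s][a]
            table[s][a] += self.alpha * td_error

    def act(self, s, target):
        """Epsilon-greedy action with respect to the selected target."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[target][s]
        return values.index(self._best(values, target[1]))


if __name__ == "__main__":
    agent = TargetSelectionQ(n_states=1, n_actions=2, gamma=0.0, epsilon=0.0)
    agent.update(0, 0, reward=1.0, punishment=0.0, s_next=0)
    agent.update(0, 1, reward=0.0, punishment=1.0, s_next=0)
    for target in TARGETS:
        print(target, "->", agent.act(0, target))
```

In this sketch every table is updated from the same transition, and only action selection depends on the currently selected target, so switching the target according to the situation changes behavior without relearning.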