Abstract
Reinforcement learning is a machine learning method for acquiring a sequence of actions that maximizes cumulative reward. However, it is difficult to optimize the interaction between humans and robots in everyday living spaces, because there is no definite standard for evaluating undesirable actions. In this study, we propose a novel learning model that uses successive reward and punishment based on human subjective evaluation. With this method, a human can restrain undesirable actions by giving punishment evaluations. We developed a dog-like robot to verify the proposed method and demonstrated its performance experimentally.