A Study of Human Teaching Characteristics in Reinforcement Learning Using Subjective Rewards

黒田 将史; 山科 亮太; 藪田 哲郎

doi:10.1299/kikaic.79.1770

Abstract

This note presents an application of reinforcement learning to the locomotion of a caterpillar robot. A major advantage of reinforcement learning is that an action can be acquired using only a simple reward. In our previous work, the reward was the forward distance measured by a sensor, which is a purely “Objective reward”. We have also studied rewards given according to a human's subjective judgment, which we define as a “Subjective reward”. Under specific conditions, the “Subjective reward” gave better results than the “Objective reward”. The main purpose of this study is to investigate how humans teach, since the teaching method is the main factor behind results that surpass those of the “Objective reward”. This note discusses what characterizes a good teacher, that is, one who gives an excellent “Subjective reward”. The results show that humans give rewards in consideration of not only distance but also the form of the motion, and that such rewards can yield better results than the “Objective reward”.
