2017, Volume 2017, Issue AGI-006, p. 07-
With large state and action spaces, it is difficult for a reinforcement learning agent to learn its policy within a practical amount of time. Previous studies have proposed methods in which a trainer suggests better actions to a trainee to speed up learning. However, when the action spaces of the trainer and the trainee are not the same, the instruction does not work unless it is mapped into the trainee's variable space. In this paper, we deal with three types of instruction: action-based expressions, abstract expressions from a human trainer, and expressions output by Instruction-based Behavior Explanation, a framework for announcing a reinforcement learning agent's future behavior. The three types of instruction were mapped to the agents' action spaces with deep reinforcement learning, and we compared the mappings to consider what form of information is suitable for instructing heterogeneous agents.