A Reward Design Method Adapted to Agents' Learning Parameters Using a Self-Organizing Map for Input Vectors with Evaluation Values

Keiichi Horio; Ippei Mori; Tetsuo Furukawa

doi:10.14864/fss.34.0_140

34th Fuzzy System Symposium

Session ID : MC2-4

Conference information

Host: Japan Society for Fuzzy Theory and Intelligent Informatics (SOFT)

Name : 34th Fuzzy System Symposium

Number : 34

Location : [in Japanese]

Date : September 03, 2018 - September 05, 2018

proceeding

Reward Design Method Adapting to Agents' Learning Ability based on Self-Organizing Map with Evaluation Value

*Keiichi HORIO, Ippei MORI, Tetsuo FURUKAWA

Keywords: input with evaluation value, reinforcement learning agent, reward design

CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

In children's education and sports coaching, it is important to give learners appropriate instruction. The instructor needs to grasp each learner's abilities and characteristics by observing the learning process, and to change the teaching method as needed. In this paper, we examine the relationship between an agent's learning parameters and an appropriate way of giving rewards, using simulation data in which agents learn a maze. The agents learn the maze with Q-learning, a well-known reinforcement learning method, and we conduct experiments with multiple agents that have different learning parameters. Agent behavior data from the middle stage of learning are classified with a self-organizing map (SOM), and the learning parameters are estimated from the result. We then change the way rewards are given and, based on the learning results, discuss which reward scheme suits which learning parameters.
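
To make the experimental setting concrete, the following is a minimal sketch of tabular Q-learning on a small grid maze, with the learning parameters exposed as arguments so that agents with different settings can be compared. It is not the authors' implementation: the maze layout, the reward values, the epsilon-greedy exploration, and the names `MAZE`, `step`, and `train` are all assumptions made for illustration, and the SOM-based classification step is not shown.

```python
import random

# Hypothetical 4x4 maze (an assumption, not from the paper):
# 'S' = start, 'G' = goal, '#' = wall, '.' = free cell.
MAZE = [
    "S..#",
    ".#..",
    "...#",
    "#..G",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action; hitting a wall or the border leaves the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    outside = not (0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]))
    if outside or MAZE[nr][nc] == "#":
        nr, nc = r, c
    done = MAZE[nr][nc] == "G"
    reward = 1.0 if done else -0.01  # assumed reward scheme
    return (nr, nc), reward, done


def train(alpha, gamma, epsilon, episodes=300, max_steps=200, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration.

    alpha: learning rate, gamma: discount factor, epsilon: exploration rate.
    Returns the Q-table and the number of steps taken in each episode.
    """
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(len(MAZE)) for c in range(len(MAZE[0]))}
    steps_per_episode = []
    for _ in range(episodes):
        state = (0, 0)  # position of 'S'
        for t in range(1, max_steps + 1):
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                best = max(q[state])
                # Break ties randomly among equally valued actions.
                action = rng.choice(
                    [a for a in range(len(ACTIONS)) if q[state][a] == best])
            nxt, reward, done = step(state, action)
            # Q-learning update: bootstrap from the best next-state value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
        steps_per_episode.append(t)
    return q, steps_per_episode
```

Calling `train` with different values of `alpha`, `gamma`, and `epsilon` yields per-episode step counts for each hypothetical agent. Logs of this kind, taken partway through learning, are the sort of behavior data that could then be passed to a SOM for classification.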
