Abstract
In children's education and in sports coaching, it is important to give suitable instruction to the children or players, whom we call agents in this paper. An instructor needs to understand each agent's abilities and characteristics by observing the learning process, and to change the teaching method as needed. In this paper, we consider learning-parameter estimation and an adequate rewarding method, using a simulation in which agents learn a maze problem. For maze learning, we use Q-learning, a well-known method in the field of reinforcement learning. We conducted experiments using multiple agents with different learning parameters. The agents' action data from the early stage of learning are used to estimate their learning parameters with a self-organizing map. We then change the rewarding method based on the estimated learning parameters.
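For readers unfamiliar with the setting, the following is a minimal sketch of tabular Q-learning on a small maze. It is illustrative only and not the simulation used in this paper: the maze layout, the reward of 1 at the goal, the epsilon-greedy policy, and the function names `step` and `train` are all assumptions. It also assumes that the "learning parameters" that differ between agents are the learning rate `alpha` and the discount factor `gamma`, which the abstract does not specify.

```python
import random

# Hypothetical 4x4 maze (an assumption, not the paper's maze):
# 'S' = start, 'G' = goal, '#' = wall, '.' = free cell.
MAZE = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action; moving into a wall or off the grid leaves the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    inside = 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0])
    if not inside or MAZE[nr][nc] == "#":
        nr, nc = r, c
    done = MAZE[nr][nc] == "G"
    reward = 1.0 if done else 0.0  # assumed reward scheme: 1 at the goal, 0 elsewhere
    return (nr, nc), reward, done


def train(alpha, gamma, epsilon=0.1, episodes=200, max_steps=100, seed=0):
    """Run tabular Q-learning; return the Q-table and the step count of each episode."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(len(MAZE)) for c in range(len(MAZE[0]))}
    steps_per_episode = []
    for _ in range(episodes):
        state = (0, 0)  # position of 'S'
        for t in range(1, max_steps + 1):
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(len(ACTIONS)) if q[state][a] == best])
            nxt, reward, done = step(state, action)
            # Standard Q-learning update; no bootstrapping from the terminal state.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
        steps_per_episode.append(t)
    return q, steps_per_episode


if __name__ == "__main__":
    # Two hypothetical agents that differ only in their learning parameters.
    for alpha, gamma in [(0.1, 0.9), (0.8, 0.5)]:
        _, steps = train(alpha, gamma)
        print(f"alpha={alpha}, gamma={gamma}: first 10 episode lengths {steps[:10]}")
```

In a sketch like this, the per-episode step counts from the first few episodes are one example of the early-stage action data from which an agent's parameters might be estimated.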