Abstract
With recent advances in robotics, robots are expected to serve as a workforce in place of people. Conventionally, the rules for controlling a robot are written by a human designer, but in an unknown environment with various uncertainties it is difficult to design such rules in advance. For this reason, there has been much research on controlling autonomous robots that have learning abilities, and reinforcement learning in particular has attracted attention. Reinforcement learning, however, depends heavily on its parameters. It is therefore necessary to develop an algorithm that can set these parameters autonomously during learning, according to the state of the environment and the progress of learning. This research aims to develop a new algorithm that controls the parameters during learning based only on reward information.