Abstract
This paper describes a convergence estimation method and a learning termination method for reinforcement learning in dynamic environments. In recent years, multi-robot systems that use reinforcement learning have been developed for real-world situations. However, conventional learning methods take a considerable amount of time to converge. Furthermore, conventional learning processes are often inefficient because a robot continues reinforcement learning even after learning has converged. In response to these problems, we propose a Knowledge Co-creation Framework (KCF) for multi-robot systems, whose efficient implementation requires autonomous convergence estimation and learning termination methods for reinforcement learning. Therefore, on the basis of the assumption that learning curves exhibit fractality, we propose a convergence estimation method and a learning termination method that use fractal dimension analysis. A computer simulation confirmed that the proposed method detects learning convergence and terminates reinforcement learning.