A Study of the Forgetting Operation in the Profit Sharing Reinforcement Learning Method

幸若 完壮; 渡辺 浩太; 五十嵐 一

doi:10.14864/fss.27.0.304.0

Abstract

In this paper, we propose a forgettable rational Profit Sharing reinforcement learning method that learns much faster than the conventional method without any additional parameters. Furthermore, the method adapts to environmental changes very quickly. Reinforcement learning (RL) is one of the effective unsupervised learning techniques. However, most RL methods do not immediately present an appropriate solution, even when the problem is small. Profit Sharing (PS) is a novel method for overcoming this difficulty. It learns a better solution much faster, rather than definitely learning an optimal solution. However, its learning becomes quite slow when the learner tries to learn long-term tasks. There are several approaches to accelerating its learning; however, these methods are relatively prone to converging to inappropriate solutions. The proposed method lets the learner forget particular experiences that make its policy worse, in order to improve the policy. Two numerical examples are presented to demonstrate its effectiveness.
