Journal of the Robotics Society of Japan
Online ISSN : 1884-7145
Print ISSN : 0289-1824
ISSN-L : 0289-1824
Paper
Reinforcement Learning by Using Dual Q-table for the Whole Space and a Partial Space
Hirokazu Matsui, Chieko Nishizawa, Yoshihiko Nomura

2019 Volume 37 Issue 7 Pages 620-631

Abstract

In this paper, we propose a Q-learning method that uses a dual Q-table. Concretely, the proposed method has two Q-tables: the “Whole Q-table” is larger and is based on the whole state space of the environment (detailed enough to learn optimal actions), while the “Partial Q-table” is smaller and is based on a subspace of the whole space (coarse, for learning rough actions). The two Q-tables simultaneously learn the environment from the selected action, and at each step the action is selected using whichever of the two Q-tables is more fully learned. We simulated the proposed method in comparison with conventional methods under three learning environments, in which the partial Q-table can learn optimal actions in a high, middle, or low proportion of the situations in the environment. The results show that the proposed method learns the optimal actions at all of these rates; the higher the rate, the earlier it converges, and even at the lowest rate the proposed method is almost as effective as the conventional one. We also showed the effectiveness of the proposed method through mathematical analysis. Furthermore, we verified that the proposed method is effective in an actual environment.
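The dual Q-table scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the state projection, the hyperparameters, and in particular the "more learned" criterion (here a simple visit-count comparison) are assumptions, since the abstract does not specify them.

```python
import random
from collections import defaultdict

class DualQTableAgent:
    """Sketch of Q-learning with two Q-tables: a fine-grained "whole"
    table over the full state space and a coarse "partial" table over a
    projected subspace. Both tables are updated at every step; the action
    is chosen from whichever table is judged better learned for the
    current state (a hypothetical visit-count test)."""

    def __init__(self, actions, project, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.actions = actions
        self.project = project  # maps a whole-space state to its subspace state
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q_whole = defaultdict(float)    # (state, action) -> value
        self.q_partial = defaultdict(float)  # (sub_state, action) -> value
        self.visits_whole = defaultdict(int)
        self.visits_partial = defaultdict(int)

    def select_action(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        sub = self.project(state)
        # Assumed "more learned" test: prefer the whole table once its
        # state has been visited at least as often as the coarse one.
        if self.visits_whole[state] >= self.visits_partial[sub]:
            table, key = self.q_whole, state
        else:
            table, key = self.q_partial, sub
        return max(self.actions, key=lambda a: table[(key, a)])

    def update(self, state, action, reward, next_state):
        sub, next_sub = self.project(state), self.project(next_state)
        # Standard one-step Q-learning update, applied to both tables.
        for table, s, s2 in ((self.q_whole, state, next_state),
                             (self.q_partial, sub, next_sub)):
            best_next = max(table[(s2, a)] for a in self.actions)
            table[(s, action)] += self.alpha * (
                reward + self.gamma * best_next - table[(s, action)])
        self.visits_whole[state] += 1
        self.visits_partial[sub] += 1
```

Because both tables share the same update rule, the coarse table accumulates experience faster per subspace state and can guide exploration early, while the fine table eventually takes over where it has been visited often enough.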

© 2018 The Robotics Society of Japan