Reinforcement Learning in Large-Scale Systems Using State Generalization and Multi-Agent Techniques

木村 元; 青木 圭; 小林 重信

doi:10.1541/ieejias.123.1091

Abstract

This paper introduces several problems that arise when reinforcement learning is applied to industrial applications and presents techniques to overcome them. Reinforcement learning is known as on-line learning of an input-output mapping through trial-and-error interaction with an uncertain environment; in real applications, however, trial and error can cause fatal damage. We introduce a planning method based on reinforcement learning in a simulator, which can be seen as a stochastic approximation of dynamic programming in Markov decision processes. In large problems, however, simple grid tiling to quantize the state space for tabular Q-learning is still infeasible. We therefore introduce a generalization technique to approximate value functions in a continuous state space and a multiagent architecture to solve large-scale problems. The efficiency of these techniques is shown through experiments on a sewage water-flow control system.
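
As an illustration of the baseline the abstract refers to, the following is a minimal sketch of tabular Q-learning run against a simulator, with the continuous state quantized by a uniform grid. Everything in it is an assumption made for illustration: the one-dimensional tank simulator, the reward, the two pump actions, and the names `quantize`, `simulate`, `table_size`, and `q_learning` are hypothetical and are not taken from the paper or its sewage water-flow model.

```python
import random


def quantize(x, low, high, bins):
    """Map a continuous value x in [low, high] to a grid cell index 0..bins-1."""
    ratio = (x - low) / (high - low)
    return min(bins - 1, max(0, int(ratio * bins)))


def table_size(bins, dims):
    """Number of grid cells when each of `dims` state variables gets `bins` cells."""
    return bins ** dims


def simulate(level, action, rng):
    """Hypothetical toy simulator (not the paper's model).

    A tank level in [0, 1] receives a random inflow; action 1 turns a pump on,
    action 0 leaves it off. The reward penalizes distance from the mid level.
    """
    inflow = rng.uniform(0.0, 0.1)
    outflow = 0.1 if action == 1 else 0.0
    next_level = min(1.0, max(0.0, level + inflow - outflow))
    reward = -abs(next_level - 0.5)
    return next_level, reward


def q_learning(bins=10, steps=20000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on the grid-quantized state, learning only in the simulator."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(bins)]  # one row per grid cell, one column per action
    level = 0.5
    for _ in range(steps):
        s = quantize(level, 0.0, 1.0, bins)
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            a = rng.randrange(2)
        else:
            a = 0 if q[s][0] >= q[s][1] else 1
        level, r = simulate(level, a, rng)
        s2 = quantize(level, 0.0, 1.0, bins)
        # Sampled one-step backup: a stochastic approximation of the DP update.
        q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
    return q
```

With one state variable the table has only `bins` rows, but `table_size(10, 6)` already gives one million cells for six variables, which is the growth that makes plain grid tiling impractical for large systems.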
