Learning Control Methods That Do Not Fail Using Instance-Based Reinforcement Learning

Tatsuo Unemi (畝見 達夫)

doi:10.11517/jjsai.7.6_1001

Abstract

We propose an instance-based learning algorithm named IBRL3, which learns how to avoid negative reinforcement from the environment. A cart-pole balancing problem and a monitoring ship navigation problem are used to evaluate its learning performance. In this algorithm, a tuple of the input and output data of each execution cycle is stored in memory verbatim, and the action of each cycle is decided by retrieving the nearest neighbor of the current input data. The number of stored instances is kept down by replacing the nearest but less reliable instance with a new one. Experimental results from computer simulation show that IBRL3 is robust across different parameter settings and in noisy environments, and that it is efficient enough to be applied to real-time control problems.
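The store-and-retrieve cycle described in the abstract can be illustrated with a minimal Python sketch. It is not the IBRL3 implementation: the abstract does not give the distance measure, the definition of reliability, or the rule by which reinforcement updates it, so the Euclidean distance, the fixed `capacity`, the `reliability` field, and the names `Instance` and `InstanceMemory` are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Instance:
    state: tuple              # input data observed in one execution cycle
    action: int               # output chosen in that cycle
    reliability: float = 0.0  # hypothetical score; not defined in the abstract


class InstanceMemory:
    """Stores instances verbatim and answers nearest-neighbor queries."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed fixed memory size
        self.instances = []

    def _nearest_index(self, state):
        # Linear scan with Euclidean distance (an assumption of this sketch).
        if not self.instances:
            return None
        return min(
            range(len(self.instances)),
            key=lambda i: math.dist(self.instances[i].state, state),
        )

    def act(self, state, default_action=0):
        # The action is the one recorded in the nearest stored instance.
        idx = self._nearest_index(state)
        return default_action if idx is None else self.instances[idx].action

    def store(self, new):
        # While there is room, keep every instance verbatim.
        if len(self.instances) < self.capacity:
            self.instances.append(new)
            return
        # Otherwise overwrite the nearest instance, but only when it is
        # no more reliable than the new one; else the new one is dropped.
        idx = self._nearest_index(new.state)
        if self.instances[idx].reliability <= new.reliability:
            self.instances[idx] = new
```

A caller would invoke `act` once per control cycle with the current input and `store` afterwards with the observed tuple. How reliability should be raised or lowered after negative reinforcement is left out here, because the abstract does not specify it.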
