Abstract
We propose a method to improve the performance of R-learning, a reinforcement learning algorithm, by using multiple state-action value tables. Unlike Q-learning or Sarsa, R-learning learns a policy that maximizes undiscounted rewards. The multiple state-action value tables induce substantial exploration when it is needed, which allows R-learning to work well. The efficiency of the proposed method is verified through experiments in a simulated environment.
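As background for the abstract, the following is a minimal sketch of a single tabular R-learning update (Schwartz, 1993), which the proposed method extends with multiple value tables; the multi-table mechanism itself is not reproduced here, and the step sizes, epsilon-greedy policy, and function names are illustrative assumptions.

```python
import random
from collections import defaultdict

alpha = 0.1    # step size for the state-action values (assumed)
beta = 0.01    # step size for the average-reward estimate rho (assumed)
epsilon = 0.1  # exploration rate for epsilon-greedy selection (assumed)

R = defaultdict(float)  # undiscounted state-action value table R(s, a)
rho = 0.0               # running estimate of the average reward per step

def select_action(state, actions):
    """Epsilon-greedy action selection over R(state, .)."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: R[(state, a)])

def r_learning_update(state, action, reward, next_state, actions):
    """One R-learning step: update R(s, a) and, on greedy actions, rho."""
    global rho
    best_next = max(R[(next_state, a)] for a in actions)
    delta = reward - rho + best_next - R[(state, action)]
    # Check greediness before the value update, as in Schwartz's rule.
    was_greedy = R[(state, action)] == max(R[(state, a)] for a in actions)
    R[(state, action)] += alpha * delta
    if was_greedy:
        # rho is adjusted only after non-exploratory (greedy) actions.
        rho += beta * delta
```

Because rho tracks the average reward rather than a discounted return, insufficient exploration can lock R-learning into a poor policy; the abstract's multiple-table idea is presented as a way to trigger additional exploration when it is needed.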