A Centralized Fast Multiagent Reinforcement Learning Method with Automated Reward Setting

佐々木 薫; 飯間 等

doi:10.5687/iscie.35.39

Abstract

In multiagent environments, a centralized reinforcement learner can find optimal policies, but learning is time-consuming. A method has been proposed that finds the optimal policies faster by using the centralized learner in combination with supplemental independent learners. To prevent learning from failing, the independent learners must be stopped at the right time, which is achieved by finely tuning a reward. This reward tuning, however, requires additional time and effort. This paper proposes a reinforcement learning method in which the reward is set automatically.
