Accelerating Learning in Reinforcement Learning through More Efficient Policy Evaluation

泉田 啓; 服部 俊; 幸田 武久

doi:10.9746/sicetr.49.696

Abstract

Typical methods for solving reinforcement learning problems iterate between two steps: policy evaluation and policy improvement. This study proposes policy evaluation algorithms that improve learning efficiency. The proposed algorithms, based on the Krylov subspace method (KSM), are tens to hundreds of times more efficient than existing algorithms based on stationary iterative methods (SIM). KSM-based algorithms are far more efficient than has generally been expected. Using numerical examples and theoretical discussion, this study clarifies what makes KSM-based algorithms more efficient.
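
To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes that policy evaluation for a fixed policy is posed as the linear system (I - gamma P) v = r, uses the classical update v <- r + gamma P v as the stationary iterative method, and swaps in unrestarted GMRES as one representative Krylov subspace method (the abstract does not say which KSM the paper uses). The random Markov chain, the discount factor 0.99, and the names `make_chain`, `evaluate_stationary`, and `evaluate_krylov` are all hypothetical choices made for illustration.

```python
import numpy as np


def make_chain(n=50, gamma=0.99, seed=0):
    """Hypothetical test problem: random transition matrix P and reward r."""
    rng = np.random.default_rng(seed)
    P = rng.random((n, n))
    P /= P.sum(axis=1, keepdims=True)  # make each row a probability distribution
    r = rng.random(n)
    return P, r, gamma


def evaluate_stationary(P, r, gamma, tol=1e-8, max_iter=100_000):
    """Stationary iteration v <- r + gamma * P v, started from v = 0."""
    v = np.zeros_like(r)
    r_norm = np.linalg.norm(r)
    for k in range(1, max_iter + 1):
        v = r + gamma * (P @ v)
        residual = np.linalg.norm(r - (v - gamma * (P @ v)))
        if residual <= tol * r_norm:
            return v, k
    return v, max_iter


def evaluate_krylov(P, r, gamma, tol=1e-8, max_iter=None):
    """Unrestarted GMRES on (I - gamma P) v = r, started from v = 0.

    Assumes r is not the zero vector.
    """
    n = len(r)
    max_iter = max_iter or n

    def apply_A(x):
        return x - gamma * (P @ x)

    beta = np.linalg.norm(r)
    Q = np.zeros((n, max_iter + 1))         # orthonormal Krylov basis
    H = np.zeros((max_iter + 1, max_iter))  # upper Hessenberg matrix
    Q[:, 0] = r / beta
    v = np.zeros_like(r)
    for k in range(max_iter):
        # Arnoldi step with modified Gram-Schmidt orthogonalization.
        w = apply_A(Q[:, k])
        for j in range(k + 1):
            H[j, k] = Q[:, j] @ w
            w -= H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(w)
        # Minimize the residual over the current Krylov subspace.
        e1 = np.zeros(k + 2)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H[: k + 2, : k + 1], e1, rcond=None)
        v = Q[:, : k + 1] @ y
        residual = np.linalg.norm(r - apply_A(v))
        if residual <= tol * beta or H[k + 1, k] < 1e-14:
            return v, k + 1
        Q[:, k + 1] = w / H[k + 1, k]
    return v, max_iter


P, r, gamma = make_chain()
v_sim, k_sim = evaluate_stationary(P, r, gamma)
v_ksm, k_ksm = evaluate_krylov(P, r, gamma)
print("stationary iterations:", k_sim)
print("Krylov (GMRES) iterations:", k_ksm)
```

Both routines stop at the same relative residual, so the printed iteration counts are directly comparable on this toy problem. The stationary update contracts the error by at most a factor of gamma per step, which is slow when gamma is close to 1, whereas GMRES picks the residual-minimizing iterate from the whole subspace built so far. How large the gap is on the paper's own benchmarks is reported in the paper, not reproduced here.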


