Abstract
Model-based reinforcement learning consists of two steps: estimation of a model of the plant and planning. Planning is formulated as a dynamic programming (DP) problem, which is solved by a DP method. This DP problem has an equivalent linear programming (LP) formulation that can be solved by an LP method, although the LP method is generally less efficient than typical DP methods. However, numerical examples show that linear programming is more efficient than the typical DP method in problems whose self-transition probabilities are large. The reason is clarified by a geometrical discussion of how the solution of each method approaches the optimal solution.
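As a toy illustration (not taken from the paper), the slowdown of DP-style iteration under large self-transition probabilities can be seen in a minimal stochastic shortest-path problem: one transient state that loops back to itself with probability p and otherwise reaches a zero-cost absorbing goal. Value iteration contracts the error by a factor p per sweep, so large p means many sweeps, whereas solving the single Bellman equality directly (as the LP formulation effectively does) yields the answer in one step. The setup and numbers here are illustrative assumptions.

```python
def sweeps_to_converge(p_self, tol=1e-6):
    # One transient state with self-loop probability p_self; each step
    # incurs cost 1, and with probability (1 - p_self) we reach a
    # zero-cost absorbing goal. Exact cost-to-go: V* = 1 / (1 - p_self).
    v_star = 1.0 / (1.0 - p_self)
    v, sweeps = 0.0, 0
    while abs(v - v_star) > tol:
        v = 1.0 + p_self * v   # Bellman backup for the transient state
        sweeps += 1
    return sweeps

# Direct solve of the linear Bellman equality V = 1 + p*V (one step):
def direct_solve(p_self):
    return 1.0 / (1.0 - p_self)

low = sweeps_to_converge(0.1)    # weak self-transition: few sweeps
high = sweeps_to_converge(0.99)  # strong self-transition: many sweeps
print(low, high, direct_solve(0.99))
```

With p = 0.99 the error shrinks only by a factor 0.99 per sweep, so value iteration needs on the order of a thousand sweeps, while the direct linear solve is immediate; this is the regime where the abstract reports the LP method becoming competitive.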