Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
32nd (2018)
Session ID : 3Pin1-05

Batch Reinforcement Learning for Linearly Solvable MDP
*Tomoki NISHI, Keisuke OTAKI, Takayoshi YOSHIMURA
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

The linearly solvable Markov decision process (L-MDP) is an important subclass of MDPs in which a good policy can be found efficiently. We develop a novel batch reinforcement learning algorithm for L-MDPs with a discretized action space. The algorithm simultaneously learns a state value function and a predictor of the state value at the next step from pre-collected data. We evaluate our method on a traffic signal control task at a single intersection using the traffic simulator SUMO. Our experiment demonstrates that the method efficiently finds a policy for this domain.
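
As background on why L-MDPs are called "linearly solvable", the sketch below solves a tiny first-exit L-MDP in the standard formulation, where the desirability z(s) = exp(-v(s)) satisfies a linear equation. It is not the batch algorithm described in this abstract: it assumes the passive dynamics `P` and state costs `q` are known rather than learned from pre-collected data, and the function name, the 4-state chain, and the cost values are all hypothetical choices made for illustration.

```python
import numpy as np

def solve_first_exit_lmdp(P, q, terminal):
    """Solve a first-exit L-MDP with known passive dynamics (illustrative only).

    P        : (n, n) row-stochastic passive transition matrix (assumed known)
    q        : (n,) state costs
    terminal : indices of absorbing goal states
    Returns the state values v and the optimal controlled transition matrix U.
    """
    P = np.asarray(P, dtype=float)
    q = np.asarray(q, dtype=float)
    n = len(q)
    term = np.zeros(n, dtype=bool)
    term[list(terminal)] = True
    inter = ~term

    # Desirability z = exp(-v); at terminal states it is fixed by the final cost.
    z = np.empty(n)
    z[term] = np.exp(-q[term])

    # Interior states obey the linear equation z_N = G (P_NN z_N + P_NT z_T),
    # with G = diag(exp(-q_N)), so z_N comes from a single linear solve.
    G = np.diag(np.exp(-q[inter]))
    A = np.eye(inter.sum()) - G @ P[np.ix_(inter, inter)]
    b = G @ P[np.ix_(inter, term)] @ z[term]
    z[inter] = np.linalg.solve(A, b)

    v = -np.log(z)

    # Optimal controlled dynamics reweight the passive dynamics by z(s').
    U = P * z[None, :]
    U = U / U.sum(axis=1, keepdims=True)
    return v, U

# Hypothetical 4-state chain: a random walk with an absorbing goal at state 3.
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
])
q = np.array([1.0, 1.0, 1.0, 0.0])

v, U = solve_first_exit_lmdp(P, q, terminal=[3])
print("state values:", v)
print("optimal transitions:\n", U)
```

In this toy chain the state values grow with distance from the goal, and the optimal transitions shift probability mass toward the goal relative to the passive random walk. A batch method like the one summarized above would have to estimate these quantities from data instead of reading them off a known `P`.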

© 2018 The Japanese Society for Artificial Intelligence