2008 Volume 2008 Issue DMSM-A703 Pages 09-
In reinforcement learning, linear models for value function approximation are promising because they scale well to large problems. In practice, however, approximation performance depends heavily on which model is chosen, so selecting an appropriate model is important. In this paper, we propose a new model selection method based on sample data and demonstrate its effectiveness on the chain walk and inverted pendulum problems.
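
To make the setting concrete, the following is a minimal sketch of selecting a linear value-function model from sample data. It is not the method proposed in the paper, which the abstract does not specify. It assumes a generic K-fold cross-validation over the number of Gaussian basis functions, a toy 20-state chain with a uniformly random policy and reward 1 at both end states, and Monte Carlo returns as regression targets. All function names and parameter values (`rbf_features`, `sample_returns`, `cv_error`, the ridge term, the candidate list) are hypothetical.

```python
import numpy as np


def rbf_features(states, n_basis, n_states):
    """Bias term plus Gaussian basis functions with evenly spaced centers (assumed feature family)."""
    centers = np.linspace(0, n_states - 1, n_basis)
    width = n_states / n_basis
    s = np.asarray(states, dtype=float)[:, None]
    phi = np.exp(-0.5 * ((s - centers[None, :]) / width) ** 2)
    return np.hstack([np.ones((len(s), 1)), phi])


def sample_returns(n_states, n_episodes, horizon, gamma, rng):
    """Collect (start state, truncated discounted return) pairs under a random left/right policy."""
    goals = {0, n_states - 1}  # assumed toy reward: 1 on entering either end of the chain
    states, returns = [], []
    for _ in range(n_episodes):
        s = int(rng.integers(n_states))
        start, g, discount = s, 0.0, 1.0
        for _ in range(horizon):
            step = 1 if rng.random() < 0.5 else -1
            s = min(max(s + step, 0), n_states - 1)
            g += discount * (1.0 if s in goals else 0.0)
            discount *= gamma
        states.append(start)
        returns.append(g)
    return np.array(states), np.array(returns)


def cv_error(states, returns, n_basis, n_states, n_folds=5, ridge=1e-3):
    """Mean held-out squared error of a ridge-regularized linear fit to the sampled returns."""
    idx = np.arange(len(states))
    errors = []
    for held_out in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, held_out)
        phi_tr = rbf_features(states[train], n_basis, n_states)
        phi_te = rbf_features(states[held_out], n_basis, n_states)
        a = phi_tr.T @ phi_tr + ridge * np.eye(phi_tr.shape[1])
        w = np.linalg.solve(a, phi_tr.T @ returns[train])
        errors.append(np.mean((phi_te @ w - returns[held_out]) ** 2))
    return float(np.mean(errors))


rng = np.random.default_rng(0)
n_states = 20
states, returns = sample_returns(n_states, n_episodes=400, horizon=30, gamma=0.9, rng=rng)

candidates = [1, 2, 4, 6, 10]  # candidate model sizes (number of basis functions)
scores = {k: cv_error(states, returns, k, n_states) for k in candidates}
best_k = min(scores, key=scores.get)
print("held-out error per model:", scores)
print("selected number of basis functions:", best_k)
```

The sketch treats each candidate basis size as one "model" and keeps the one with the lowest held-out error. Any other sample-based criterion could be substituted for `cv_error` without changing the surrounding loop.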