Abstract
Q-learning is one of the best-known algorithms for reinforcement learning. A common way to represent the Q-value function is a Q-table, i.e., a look-up table. However, without prior knowledge it is difficult to specify a suitable discretization of the state space. In this paper, a method for adaptively constructing the state space in Q-learning by storing the data the agent has experienced is proposed. The effectiveness of this method is confirmed by simulations of path-planning problems. Furthermore, a method for automatically setting the parameter that resolves the trade-off between exploration and exploitation is proposed.
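As a reminder of the setting (this is the standard tabular update, not the adaptive state-space construction proposed here), a Q-table stores one value per state-action pair and is updated by

\[
Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right],
\]

where $\alpha$ denotes the learning rate and $\gamma$ the discount factor; neither symbol is defined in the abstract itself.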