Proceedings of the Fuzzy System Symposium
30th Fuzzy System Symposium
Session ID : MD2-2

The Effect of UCB Algorithm in Reinforcement Learning
*Koki Saito, Akira Notsu, Katsuhiro Honda
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

The UCB algorithm was proposed as an action selection method for the multi-armed bandit problem. In this method, an agent chooses an action by comparing the upper bounds of the confidence intervals of the estimated action values, and it thereby performs better than other methods such as ε-greedy. In this paper, we propose a method for applying the UCB algorithm to Q-learning and experimentally evaluate its performance on a shortest path problem in a continuous state space.
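
To make the selection rule concrete, the following is a minimal sketch of the standard UCB1 rule on a Bernoulli bandit. It is not the authors' method: the abstract does not give their formulation for Q-learning or for continuous state spaces. The function names `ucb1_select` and `run_bandit`, the exploration constant `c`, and the reward probabilities are all assumptions made for illustration.

```python
import math
import random


def ucb1_select(values, counts, total, c=math.sqrt(2)):
    """Return the index of the action with the largest upper confidence bound.

    values: estimated mean reward per action (illustrative name)
    counts: number of times each action has been tried
    total:  current time step, used inside the logarithm
    c:      exploration constant; sqrt(2) gives the usual UCB1 bonus
    """
    # An untried action has no defined bound, so try it first.
    for action, n in enumerate(counts):
        if n == 0:
            return action
    # Estimated value plus a bonus that shrinks as the action is tried more.
    scores = [
        v + c * math.sqrt(math.log(total) / n)
        for v, n in zip(values, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def run_bandit(probs, steps=2000, seed=0):
    """Play a Bernoulli bandit with UCB1 and return (values, counts)."""
    rng = random.Random(seed)
    values = [0.0] * len(probs)
    counts = [0] * len(probs)
    for t in range(1, steps + 1):
        action = ucb1_select(values, counts, t)
        reward = 1.0 if rng.random() < probs[action] else 0.0
        counts[action] += 1
        # Incremental update of the sample mean for the chosen action.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts


if __name__ == "__main__":
    values, counts = run_bandit([0.2, 0.8])
    print("estimated values:", values)
    print("pull counts:", counts)
```

One plausible way to carry this over to Q-learning, again as an assumption rather than the paper's construction, would be to keep a separate count per state-action pair and apply the same bonus to the Q-values of the current state.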

© 2014 Japan Society for Fuzzy Theory and Intelligent Informatics