Proceedings of the Fuzzy System Symposium
30th Fuzzy System Symposium
Session ID : MD2-3

Reinforcement learning using probability distribution to state values
*Wataru Sato, Kanta Tachibana
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Reinforcement learning is a method for learning optimal behavior through trial and error in an unknown environment. If the environment is strongly non-stationary, the agent takes a long time to learn the optimal behavior. Various studies have addressed this problem. As far as we know, the existing methods share a structure that consists of recognizing an environmental change and then responding to it. In the conventional methods, the agent has a sensor for detecting environmental change and switches between the optimal behavior and the exploring behavior. In our method, state values are represented as probability distributions that are revised by Bayesian updating, and the choice between the optimal behavior and the exploring behavior is decided according to these distributions.
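
The abstract does not specify the form of the distribution or the selection rule, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a Gaussian belief over each value estimate, a conjugate normal update with known observation noise, and selection by sampling from the beliefs (Thompson sampling, swapped in here as a stand-in). The names `GaussianValue`, `choose_action`, `noise_var`, and `drift_var` are hypothetical.

```python
import random


class GaussianValue:
    """Normal belief N(mean, var) over a single value estimate (assumed form)."""

    def __init__(self, mean=0.0, var=1.0):
        self.mean = mean
        self.var = var

    def update(self, reward, noise_var=1.0, drift_var=0.0):
        # Assumption: drift_var inflates the prior variance before each update,
        # so old evidence is gradually discounted in a non-stationary setting.
        prior_var = self.var + drift_var
        # Conjugate normal update with known observation noise variance.
        gain = prior_var / (prior_var + noise_var)
        self.mean += gain * (reward - self.mean)
        self.var = (1.0 - gain) * prior_var

    def sample(self, rng):
        # Draw one plausible value from the current belief.
        return rng.gauss(self.mean, self.var ** 0.5)


def choose_action(beliefs, rng):
    """Pick the index whose sampled value is largest.

    Wide beliefs produce scattered samples (exploration); narrow beliefs
    produce samples near the mean (exploitation). No explicit switch is needed.
    """
    samples = [belief.sample(rng) for belief in beliefs]
    return max(range(len(beliefs)), key=samples.__getitem__)
```

In this sketch, exploration arises from the width of the belief itself: when an estimate is uncertain, its samples vary widely and the corresponding choice is tried more often, without a separate change detector.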

© 2014 Japan Society for Fuzzy Theory and Intelligent Informatics