Abstract
Typical fuzzy reinforcement learning algorithms are based on value-function approaches, such as fuzzy Q-learning in Markov decision processes (MDPs), and use constant or linear functions in the conclusion parts of their fuzzy rules. In this paper, we propose a reinforcement learning algorithm based on a policy-function approach, in which fuzzy functions are used in the conclusion parts of the rules and serve as the policy function of an agent.