Abstract
Typical fuzzy reinforcement learning algorithms take value-function-based approaches, such as fuzzy Q-learning in Markov decision processes (MDPs), and use constant or linear functions in the conclusion parts of fuzzy rules. In a previous paper, we proposed a reinforcement learning algorithm based on the policy-gradient approach. Our method can handle fuzzy sets even in the conclusion parts and can also learn the weights of fuzzy rules. This paper shows through experiments that the proposed learning method is applicable and effective for a decision-making problem of a soccer robot playing in the RoboCup Soccer Small Size League.