Abstract
RoboCup Soccer Simulation League 2D is a competition in which independently moving software players (agents) play soccer on a virtual field inside a computer. The league serves as a testbed for research on multiagent systems. This paper addresses the policy of an agent when it holds the ball. In Agent2D, this policy uses a game tree, whose nodes are predicted states resulting from agents' actions, together with a positional evaluation function that scores those predicted states. We designed the positional evaluation function using soccer heuristics and applied a policy gradient reinforcement learning algorithm to learn the weight parameters of the evaluation function.
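
To make the idea of learning evaluation weights by policy gradient concrete, the following is a minimal sketch and not the implementation used in this paper or in Agent2D. It assumes a linear evaluation function over hand-crafted features, a Boltzmann (softmax) policy over candidate predicted states, and a REINFORCE-style update; the function names, the feature vectors, the temperature, and the learning rate are all hypothetical.

```python
import math


def evaluate(weights, features):
    """Assumed linear evaluation: weighted sum of heuristic features of a state."""
    return sum(w * f for w, f in zip(weights, features))


def softmax_policy(weights, candidates, temperature=1.0):
    """Assumed Boltzmann policy: probability of choosing each candidate state."""
    scores = [evaluate(weights, f) / temperature for f in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(weights, candidates, chosen, reward,
                         lr=0.1, temperature=1.0):
    """One REINFORCE-style update: w <- w + lr * reward * grad log pi(chosen)."""
    probs = softmax_policy(weights, candidates, temperature)
    n = len(weights)
    # Expected feature vector under the current policy.
    expected = [sum(p * f[i] for p, f in zip(probs, candidates))
                for i in range(n)]
    # Gradient of log pi(chosen) for a softmax over linear scores.
    grad = [(candidates[chosen][i] - expected[i]) / temperature
            for i in range(n)]
    return [w + lr * reward * g for w, g in zip(weights, grad)]


# Hypothetical usage: two candidate states described by two features each.
weights = [0.0, 0.0]
candidates = [[1.0, 0.0], [0.0, 1.0]]
updated = policy_gradient_step(weights, candidates, chosen=0, reward=1.0)
```

In this sketch, a positive reward after choosing the first candidate raises the weight on the feature that distinguishes it, so the policy selects similar states more often afterwards.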