Abstract
This paper aims to accelerate the learning process of the actor-critic method, one of the major reinforcement learning algorithms, through transfer learning. Reinforcement learning allows agents to solve target tasks autonomously. Transfer learning is an effective way to accelerate the learning processes of machine learning algorithms: it does so by reusing prior knowledge from a policy learned on a source task. Transfer learning raises two basic issues: how to select an effective source policy, and how to reuse it without negative transfer. In this paper, we mainly discuss the latter, and propose a reuse method built on our previously proposed selection method. In the actor-critic method, a policy is constructed from two parameter sets: action preferences and state values. To avoid negative transfer, agents reuse only reliable action preferences, together with the state values that imply preferred actions. We perform simple experiments to show the effectiveness of the proposed method.
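As a rough illustration of this selective-reuse idea, the sketch below shows one way it could be realized in a tabular actor-critic setting. The reliability tests used here (a visit-count threshold and a preference margin) and all names (selective_transfer, min_visits, margin) are our own assumptions for illustration, not the paper's actual criteria.

```python
import numpy as np

def selective_transfer(src_pref, src_values, src_visits,
                       min_visits=50, margin=0.5):
    """Initialize a target agent's tables from a source policy,
    copying only entries judged reliable.

    src_pref   : (n_states, n_actions) action preferences of the source.
    src_values : (n_states,) state values of the source.
    src_visits : (n_states,) visit counts, used here as a reliability proxy
                 (an assumption; the paper may use a different criterion).
    """
    n_states, n_actions = src_pref.shape
    tgt_pref = np.zeros((n_states, n_actions))
    tgt_values = np.zeros(n_states)

    for s in range(n_states):
        if src_visits[s] < min_visits:
            continue  # too little experience: keep the neutral initialization
        # Gap between the best and second-best action preference.
        gap = src_pref[s].max() - np.partition(src_pref[s], -2)[-2]
        if gap >= margin:
            # Preferences clearly single out an action: reuse them,
            # together with the state value that implies it is preferred.
            tgt_pref[s] = src_pref[s]
            tgt_values[s] = src_values[s]
    return tgt_pref, tgt_values
```

Under this sketch, states where the source policy is ambiguous or under-explored stay at a neutral initialization, so the transferred knowledge cannot overrule learning on the target task there, which is the intent behind avoiding negative transfer.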