Host: The Japanese Society for Artificial Intelligence
Name: 34th Annual Conference, 2020
Number: 34
Location: Online
Date: June 09, 2020 - June 12, 2020
Reinforcement learning is attracting increasing attention in real-world applications. Since training an agent directly in the real-world environment (called the target task) often incurs enormous cost, pre-training in a low-cost environment such as a simulator (called the source task) has gained attention. In this paper, we focus on the situation where the source and target tasks differ only in the form of state observation. Our proposed method trains encoders that map state observations to latent representations, together with a policy that receives a latent representation and outputs an action. We utilize the transition probability to learn latent representations that are robust to changes in the form of state observation. This enables the policy learned in the source task to be transferred to improve performance in the target task. Experiments show that our method achieves higher performance when the number of interactions in the target task is limited.
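
The following is a minimal sketch of the general idea described in the abstract, not the authors' actual implementation: per-task encoders map observations into a shared latent space, a single policy acts on latents, and a latent transition model ties the representations to the environment dynamics shared by the source and target tasks. All module names, network sizes, and the loss form are illustrative assumptions (PyTorch).

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps a task-specific observation to a shared latent representation."""
    def __init__(self, obs_dim, latent_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, obs):
        return self.net(obs)

class Policy(nn.Module):
    """Acts on the latent representation, so it can transfer across tasks
    whose observations differ only in form."""
    def __init__(self, latent_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, action_dim),
        )

    def forward(self, z):
        return self.net(z)

class LatentTransition(nn.Module):
    """Predicts the next latent from the current latent and action.
    Because source and target tasks share dynamics, fitting this model
    encourages latents that are invariant to the observation form."""
    def __init__(self, latent_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, z, a):
        return self.net(torch.cat([z, a], dim=-1))

def transition_consistency_loss(encoder, dynamics, obs, action, next_obs):
    """One-step consistency (an assumed loss form): encode(next_obs)
    should match the latent predicted from (encode(obs), action)."""
    z, z_next = encoder(obs), encoder(next_obs)
    z_pred = dynamics(z, action)
    return ((z_pred - z_next.detach()) ** 2).mean()
```

Under these assumptions, source-task pre-training would optimize the policy's RL objective jointly with this consistency loss, and a fresh encoder for the target task would be fit with the same loss so the pre-trained policy can be reused with few target-task interactions.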