Host : The Japanese Society for Artificial Intelligence
Name : The 39th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 39
Location : [in Japanese]
Date : May 27, 2025 - May 30, 2025
Recent studies have shown that offline data such as text can substantially improve the efficiency of task learning when used to pretrain world models. In particular, Dynalang has demonstrated the effectiveness of leveraging text describing task instructions and environment dynamics to improve performance. However, its application has been largely limited to the Messenger task, leaving its generalizability to other tasks and the influence of text type and quality during pretraining insufficiently explored. In this study, we extend Dynalang's approach to the simpler HomeGrid task to evaluate its generalizability. We also explore using large language models (LLMs) to generate and expand domain-specific text, aiming to further improve initial task performance and sample efficiency. In addition, we propose and assess a two-stage pretraining strategy in which general text is first used to build fundamental language understanding and domain-specific text is then used to strengthen task-specific capabilities. Our findings highlight the potential of broadening the applicability of text-based pretraining strategies.
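To make the two-stage strategy concrete, the following is a minimal sketch of the idea, not the authors' implementation: a tiny next-token model stands in for the world model's sequence model, and `general_corpus` and `domain_corpus` are hypothetical placeholders for the general and LLM-generated domain-specific text described above.

```python
# Hedged sketch: two-stage text pretraining with the same next-token objective
# in both stages. All names and data here are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB, EMB, HID = 256, 32, 64  # byte-level vocabulary for simplicity

class TinySequenceModel(nn.Module):
    """Stand-in for the world model's sequence model (not Dynalang's architecture)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.head = nn.Linear(HID, VOCAB)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)

def encode(texts, seq_len=32):
    # Byte-encode each string, then pad/truncate to a fixed length.
    rows = [list(t.encode("utf-8"))[:seq_len] for t in texts]
    rows = [r + [0] * (seq_len - len(r)) for r in rows]
    return torch.tensor(rows, dtype=torch.long)

def pretrain(model, tokens, epochs, lr=1e-3):
    # Next-token prediction loss; identical objective for both stages.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        logits = model(tokens[:, :-1])
        loss = loss_fn(logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
        opt.zero_grad(); loss.backward(); opt.step()

# Hypothetical corpora: broad text vs. domain text about the target environment.
general_corpus = ["the agent walks through the rooms", "pick up the object on the table"]
domain_corpus = ["the bottle goes in the recycling bin", "step on the pedal to open the bin"]

model = TinySequenceModel()
pretrain(model, encode(general_corpus), epochs=50)   # Stage 1: general language understanding
pretrain(model, encode(domain_corpus), epochs=50)    # Stage 2: task-specific knowledge
# The resulting weights would then initialize the agent's world model
# before online reinforcement learning in the target task (e.g., HomeGrid).
```

In this sketch the two stages differ only in the corpus used, mirroring the abstract's description; how the pretrained weights are transferred into the full world-model agent is left out, since it depends on the specific Dynalang setup.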