Scaling Laws of Model Size for World Models

佐藤 誠人; 海野 良介; 根岸 優大; 田畑 浩大; 渡部 泰樹; 蒲原 惇乃輔; 久米 大雅; 岡田 領; 岩澤 有祐; 松尾 豊

doi:10.11517/pjsai.JSAI2023.0_2G5OS21e02

37th (2023)

Session ID : 2G5-OS-21e-02

Conference information

Host: The Japanese Society for Artificial Intelligence

Name : The 37th Annual Conference of the Japanese Society for Artificial Intelligence

Number : 37

Location : [in Japanese]

Date : June 06, 2023 - June 09, 2023

Scaling Laws of Model Size for World Models

*Makoto SATO, Ryosuke UNNO, Masahiro NEGISHI, Koudai TABATA, Taiju WATANABE, Junnosuke KAMOHARA, Taiga KUME, Ryo OKADA, Yusuke IWASAWA, Yutaka MATSUO

Author information

*Makoto SATO : Nara Institute of Science and Technology; Matsuo Institute
Ryosuke UNNO : The University of Tokyo; Matsuo Institute
Masahiro NEGISHI : The University of Tokyo; Matsuo Institute
Koudai TABATA : The University of Tokyo; Matsuo Institute
Taiju WATANABE : Waseda University; Matsuo Institute
Junnosuke KAMOHARA : Tohoku University; Matsuo Institute
Taiga KUME : Keio University; Matsuo Institute
Ryo OKADA : The University of Tokyo; Matsuo Institute
Yusuke IWASAWA : The University of Tokyo
Yutaka MATSUO : The University of Tokyo

Keywords: World Models, Large Language Models, Scaling Laws

CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

With the development of deep learning, significant performance improvements have been achieved in computer vision and natural language processing. Scaling laws, which describe how model performance changes exponentially with model size, dataset size, and the computational resources used for training, have played a significant role in these advances. These scaling laws have been reported to hold for various tasks, including image classification, image generation, and natural language processing. However, it has not yet been verified whether they also hold for tasks that involve long-horizon prediction. In this study, we investigate the validity of scaling laws for world models from the perspective of model size. We scale the model sizes of two world models on an action-conditioned video prediction task using the CARLA dataset, and verify that the loss decreases exponentially and that the scaling law holds when a large-scale autoencoder is included.
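
As an illustration of how a model-size scaling trend of this kind can be checked, the following minimal sketch fits a power-law form, loss = a * N^(-alpha), which is the form commonly used in the scaling-law literature and appears as a straight line on log-log axes. The functional form, the function name `fit_power_law`, and all numbers are assumptions made for illustration; they are synthetic and are not the paper's data, models, or fitting procedure.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss = a * size**(-alpha) by ordinary least squares in log-log space.

    Hypothetical helper for illustration; not taken from the paper.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in losses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Slope and intercept of the best-fit line log(loss) = intercept + slope * log(size)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic parameter counts and losses generated from a known power law
# (assumed values, chosen only so the fit can be checked).
model_sizes = [1e6, 1e7, 1e8, 1e9]
true_a, true_alpha = 5.0, 0.1
losses = [true_a * n ** (-true_alpha) for n in model_sizes]

a, alpha = fit_power_law(model_sizes, losses)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```

With measured losses in place of the synthetic ones, a poor fit in log-log space (large residuals around the line) would suggest that a single power law does not describe the trend over the tested range of model sizes.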
