Host: The Japanese Society for Artificial Intelligence
Name : The 39th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 39
Location : [in Japanese]
Date : May 27, 2025 - May 30, 2025
In recent years, large language models (LLM) have been advancing rapidly worldwide, emphasizing the growing importance of cultivating capabilities within Japan. This paper presents an LLM development project led by the Matsuo and Iwasawa Lab as part of the GENIAC project, whose primary goal is to foster domestic expertise and reinforce national development capacity. Volunteers from the public worked with the lab to create 8B and 8×8B models from scratch. When we began our research in April 2024, domestically developed models still faced certain challenges in dialogue and text generation. On the other hand, our approach focused on improving dialogue and composition through synthetic data. Evaluations using the widely recognized “Japanese MT-Bench” indicated that our 8B model surpassed existing 10B-class models, while our 8×8B model performed on par with GPT-3.5, placing it at the forefront among domestically developed LLMs. Both models and their training code have been released under the Apache License 2.0, contributing to academic research and industrial applications of Japanese LLMs.