Acquiring Bidirectional Representations Based on Large and Small Language Models

五藤 巧; 長尾 浩良; 是枝 祐太

doi:10.11517/pjsai.JSAI2024.0_3Xin226

38th (2024)

Session ID : 3Xin2-26


Conference information

Host: The Japanese Society for Artificial Intelligence

Name : The 38th Annual Conference of the Japanese Society for Artificial Intelligence

Number : 38


Date : May 28, 2024 - May 31, 2024

Acquiring Bidirectionality via Large and Small Language Model

*Takumi GOTO, Hiroyoshi NAGAO, Yuta KOREEDA


Keywords: Natural Language Processing, Large Language Model, Named Entity Recognition


Abstract

In this study, we raise the issue of unidirectionality that arises when large causal language models are applied to classical NLP tasks. As a solution, we propose a method that takes the representations of a newly trained small-scale backward language model, concatenates them with those of the large model, and uses the result as input for downstream tasks. Through experiments on named entity recognition tasks, we demonstrate that introducing the backward model improves benchmark performance by more than 10 points. Furthermore, we report that the proposed method is especially effective for rare domains and in few-shot learning settings.
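
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: toy running-mean encoders stand in for the large forward (causal) model and the small backward model, and every name and dimension here (`toy_embed`, `causal_states`, `fwd_dim`, `bwd_dim`) is a hypothetical choice made for illustration. The sketch shows only that a causal representation of a token cannot see later tokens, and that concatenating a backward representation restores that right-hand context.

```python
from typing import List


def toy_embed(token: str, dim: int) -> List[float]:
    """Deterministic toy embedding (hypothetical stand-in for a model layer)."""
    code = sum(ord(c) for c in token)
    return [((code * (i + 3)) % 17) / 17.0 for i in range(dim)]


def causal_states(tokens: List[str], dim: int) -> List[List[float]]:
    """Running mean of embeddings: the state at t depends only on tokens[0..t]."""
    states: List[List[float]] = []
    total = [0.0] * dim
    for count, token in enumerate(tokens, start=1):
        total = [a + b for a, b in zip(total, toy_embed(token, dim))]
        states.append([x / count for x in total])
    return states


def forward_states(tokens: List[str], dim: int = 8) -> List[List[float]]:
    """Stand-in for the large causal model: left context only."""
    return causal_states(tokens, dim)


def backward_states(tokens: List[str], dim: int = 2) -> List[List[float]]:
    """Stand-in for the small backward model: read right to left, then realign."""
    return causal_states(tokens[::-1], dim)[::-1]


def bidirectional_states(tokens: List[str], fwd_dim: int = 8,
                         bwd_dim: int = 2) -> List[List[float]]:
    """Concatenate forward and backward states token by token."""
    fwd = forward_states(tokens, fwd_dim)
    bwd = backward_states(tokens, bwd_dim)
    return [f + b for f, b in zip(fwd, bwd)]


if __name__ == "__main__":
    person = ["Paris", "Hilton", "sang"]
    place = ["Paris", "is", "large"]
    # The forward state of "Paris" is identical in both sentences ...
    print(forward_states(person)[0] == forward_states(place)[0])
    # ... while the concatenated state differs, because it sees what follows.
    print(bidirectional_states(person)[0] == bidirectional_states(place)[0])
```

In a named entity recognition setting, the concatenated per-token vectors would be fed to a token classifier. The small backward dimension in the sketch only mirrors the "large and small" pairing in the title and is not a value from the paper.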
