自然言語処理 (Journal of Natural Language Processing)
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper (Peer-Reviewed)
Dataset Distillation with Attention Labels for Fine-tuning BERT
Aru Maekawa, Naoki Kobayashi, Kotaro Funakoshi, Manabu Okumura
Free access

2025, Volume 32, Issue 1, pp. 283-299

Abstract

Dataset distillation aims to create a small dataset of informative synthetic samples that can rapidly train neural networks while retaining the performance obtained with the original dataset. In this study, we focus on constructing distilled few-shot datasets for natural language processing (NLP) tasks to fine-tune pre-trained transformers. Specifically, we propose introducing attention labels, which efficiently distill knowledge from the original dataset and transfer it to transformer models via attention probabilities. We evaluated our dataset distillation methods on four NLP tasks and demonstrated that distilled few-shot datasets with attention labels can be constructed that yield impressive performance for fine-tuning BERT. For example, on AGNews, a four-class news classification task, our distilled few-shot dataset achieved up to 93.2% accuracy, which is 98.5% of the accuracy obtained with the original dataset, even with only one sample per class and a single gradient step.

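As a rough illustration of the setting described in the abstract, the sketch below shows how a single gradient step of BERT fine-tuning might combine a classification loss on distilled synthetic samples with an auxiliary loss that matches the model's attention probabilities to distilled attention labels. The tensor shapes, the KL-divergence formulation, the loss weight lambda_attn, and the random placeholders standing in for actually optimized distilled data are assumptions of this sketch, not the paper's implementation; the distillation procedure that learns those tensors is not shown.

# Hypothetical sketch: one gradient step of fine-tuning BERT on a distilled
# few-shot dataset with attention labels. Shapes, losses, and hyperparameters
# are illustrative assumptions, not the paper's actual method.
import torch
import torch.nn.functional as F
from transformers import BertForSequenceClassification

NUM_CLASSES = 4          # e.g., a four-class task such as AGNews
SEQ_LEN, HIDDEN = 32, 768

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=NUM_CLASSES, output_attentions=True
)

# Distilled "dataset": one synthetic input-embedding sequence per class,
# soft class labels, and per-layer attention-probability labels.
# In practice these tensors would be optimized by the distillation procedure;
# here they are random placeholders.
synthetic_embeds = torch.randn(NUM_CLASSES, SEQ_LEN, HIDDEN)
class_labels = torch.eye(NUM_CLASSES)
num_layers = model.config.num_hidden_layers
num_heads = model.config.num_attention_heads
attention_labels = torch.softmax(
    torch.randn(num_layers, NUM_CLASSES, num_heads, SEQ_LEN, SEQ_LEN), dim=-1
)

# A single gradient step, as in the one-step few-shot setting described above.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
outputs = model(inputs_embeds=synthetic_embeds)
ce_loss = F.cross_entropy(outputs.logits, class_labels)

# Match the model's attention probabilities to the distilled attention labels
# (KL divergence averaged over layers); lambda_attn is an assumed weight.
attn_loss = sum(
    F.kl_div(torch.log(attn + 1e-12), attention_labels[i], reduction="batchmean")
    for i, attn in enumerate(outputs.attentions)
) / num_layers
lambda_attn = 1.0
loss = ce_loss + lambda_attn * attn_loss
loss.backward()
optimizer.step()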
© 2025 The Association for Natural Language Processing