Journal of Natural Language Processing (自然言語処理)
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper (Refereed)
Weighted Asymmetric Loss for Multi-Label Text Classification on Imbalanced Data
Yuki Yasuda, Taro Miyazaki, Jun Goto
Journal: Free Access

2024, Volume 31, Issue 3, pp. 1166-1192

Abstract

Multi-label text classification, which assigns multiple labels to a single text, is a key task in natural language processing. In this task, a model is often trained on an imbalanced dataset whose label frequencies follow a long-tail distribution. Low-frequency labels that rarely appear in the training data have an extremely small number of positive samples, so most of the input samples are negative. As a result, the model learns low-frequency labels from a loss value dominated by negative samples. In this research, we propose a method called weighted asymmetric loss, which combines a weight based on the appearance frequency of each label, a weight that suppresses the loss contribution of negative samples, and a label smoothing method based on the co-occurrences of labels. Experimental results demonstrate that the proposed method improves accuracy compared with existing methods, especially on imbalanced datasets.
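To make the three ingredients named in the abstract concrete, the following is a minimal PyTorch sketch, assuming a standard asymmetric-loss formulation for the negative-sample suppression, a hypothetical inverse-frequency label weight, and a hypothetical co-occurrence-based smoothing of the targets. The function names and hyperparameters (gamma_neg, gamma_pos, clip, alpha) are illustrative assumptions, not the exact formulation in the paper.

```python
import torch

def cooccurrence_smoothing(targets, cooc, alpha=0.1):
    """Soften multi-hot targets toward labels that frequently co-occur
    with the gold labels (hypothetical smoothing scheme).

    targets: (batch, num_labels) multi-hot ground truth in {0, 1}
    cooc:    (num_labels, num_labels) row-normalized co-occurrence matrix
    """
    smoothed = (1.0 - alpha) * targets + alpha * (targets @ cooc)
    return smoothed.clamp(max=1.0)

def weighted_asymmetric_loss(logits, targets, label_freq,
                             gamma_neg=4.0, gamma_pos=0.0,
                             clip=0.05, eps=1e-8):
    """Illustrative frequency-weighted asymmetric loss.

    logits:     (batch, num_labels) raw model outputs
    targets:    (batch, num_labels) (possibly smoothed) target labels
    label_freq: (num_labels,) appearance count of each label in training data
    """
    probs = torch.sigmoid(logits)

    # Probability shifting for negatives: very easy negatives (probs < clip)
    # contribute no loss at all.
    probs_neg = (probs - clip).clamp(min=0)

    # Asymmetric focusing: easy negatives are down-weighted more strongly
    # than easy positives (gamma_neg > gamma_pos).
    loss_pos = targets * torch.log(probs.clamp(min=eps)) * (1 - probs) ** gamma_pos
    loss_neg = (1 - targets) * torch.log((1 - probs_neg).clamp(min=eps)) * probs_neg ** gamma_neg

    # Per-label weight from appearance frequency: rarer labels get a larger
    # weight (hypothetical inverse-square-root weighting for illustration).
    freq_weight = (1.0 / label_freq.clamp(min=1.0)) ** 0.5

    return -(freq_weight * (loss_pos + loss_neg)).mean()

# Usage sketch: smooth the targets with label co-occurrences, then apply the loss.
# logits = model(batch)                                  # (batch, num_labels)
# soft_targets = cooccurrence_smoothing(targets, cooc)   # (batch, num_labels)
# loss = weighted_asymmetric_loss(logits, soft_targets, label_freq)
```

The sketch only shows how frequency weighting, negative-loss suppression, and co-occurrence smoothing can be composed in a single objective; the actual weighting scheme and smoothing definition used in the paper may differ.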

© 2024 The Association for Natural Language Processing