Journal of Natural Language Processing (自然言語処理)
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper
A Simple and Effective Usage of Word Clusters for CBOW Model
Yukun Feng, Chenlong Hu, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura
Journal, free access

2022, Volume 29, Issue 3, pp. 785-806

Abstract

We propose a simple and effective method for incorporating word clusters into the Continuous Bag-of-Words (CBOW) model. Specifically, we replace infrequent input and output words in CBOW with their clusters. The resulting cluster-incorporated CBOW model produces embeddings for frequent words and a small number of cluster embeddings, which are then fine-tuned in downstream tasks. We empirically demonstrate that this replacement method works well on several downstream tasks. Our analysis also shows that the method is potentially useful for other similar models that produce word embeddings.
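
The replacement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `replace_infrequent_with_clusters`, the `min_count` threshold, the `<cluster_N>` token format, and the toy word-to-cluster mapping are all assumptions made for illustration, and the abstract does not say how clusters are obtained or where the frequency cutoff lies. Because CBOW draws both its context (input) and target (output) words from the same corpus, rewriting the corpus once replaces infrequent words on both sides.

```python
from collections import Counter


def replace_infrequent_with_clusters(sentences, word_to_cluster, min_count):
    """Replace words seen fewer than min_count times with a cluster token.

    sentences: list of tokenized sentences (lists of strings).
    word_to_cluster: hypothetical mapping from word to cluster id.
    min_count: hypothetical frequency threshold for "infrequent".
    """
    counts = Counter(word for sentence in sentences for word in sentence)
    replaced = []
    for sentence in sentences:
        new_sentence = []
        for word in sentence:
            # Only words that are rare AND have a known cluster are replaced.
            if counts[word] < min_count and word in word_to_cluster:
                new_sentence.append(f"<cluster_{word_to_cluster[word]}>")
            else:
                new_sentence.append(word)
        replaced.append(new_sentence)
    return replaced


# Toy corpus and cluster assignment, invented for this example.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "ocelot", "slept"],
]
clusters = {"cat": 0, "dog": 0, "ocelot": 0, "sat": 1, "slept": 1}

preprocessed = replace_infrequent_with_clusters(corpus, clusters, min_count=2)
```

The rewritten corpus could then be passed to any CBOW trainer, which would learn one embedding per frequent word and one per cluster token.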

© 2022 The Association for Natural Language Processing