Domain Adaptation using Word Embeddings for Word Sense Disambiguation

Kanako Komiya; Minoru Sasaki; Hiroyuki Shinnou; Manabu Okumura

doi:10.5715/jnlp.25.463

Abstract

In this paper, we propose domain adaptation using word embeddings for word sense disambiguation (WSD). Word embeddings derived from a huge corpus such as Wikipedia have already been shown to be valid for WSD, but their validity in a domain adaptation framework has not previously been discussed. Moreover, even if word embeddings are valid in this new context, the impact of the document type of the corpora from which they are derived is still unknown. Therefore, we investigate the performance of domain adaptation in WSD using word embeddings from the source, target, and general corpora and examine (1) whether the word embeddings are valid for domain adaptation of WSD and (2) if they are, the effects of the document type of the corpora from which the word embeddings are derived. We used three corpora of distinct document types and performed domain adaptation experiments using the document types as the domains. The experiments, conducted using Japanese corpora, revealed that the accuracy of WSD was highest when we used the word embeddings obtained from the target corpora together with a general corpus.

Licensed under CC BY 4.0
https://creativecommons.org/licenses/by/4.0/
