Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
Unsupervised Domain Adaptations for Word Sense Disambiguation by Learning under Covariate Shift
Hiroyuki Shinnou, Minoru Sasaki
JOURNAL FREE ACCESS

2014 Volume 21 Issue 5 Pages 1011-1035

Abstract
In this paper, we apply learning under covariate shift to the problem of unsupervised domain adaptation for word sense disambiguation (WSD). This is a weighted learning method in which the probability density ratio w(x) = P_T(x)/P_S(x) is used as the weight of an instance. However, w(x) tends to be small in WSD tasks. To address this problem, we calculate w(x) by estimating P_T(x) and P_S(x), where P_S(x) is estimated by treating the combination of the source domain corpus and the target domain corpus as the source domain corpus. In the experiments, we use three domains in BCCWJ, namely OC (Yahoo! Chiebukuro), PB (books), and PN (newspapers), and 16 target words provided by the Japanese WSD task in SemEval-2. To calculate w(x), we also use uLSIF, which directly estimates w(x) without estimating P_T(x) or P_S(x). Moreover, we use the “p power” method and the “relative probability density ratio” method to boost the obtained probability density ratio. The experiments show that our method is effective.
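As a rough illustration of the weighting described in the abstract, the sketch below computes the density ratio w(x) from per-instance target and source density estimates and then applies the two boosting schemes. The abstract names these schemes without defining them, so the definitions here are assumptions taken from the general covariate shift literature: the "p power" weight is w(x)^p with 0 < p < 1, and the relative density ratio is target / (alpha * target + (1 - alpha) * source). All function names, parameter values, and toy densities are hypothetical and are not taken from the paper.

```python
def density_ratio(p_t, p_s, eps=1e-12):
    """Plain density ratio: target density over source density, per instance."""
    return [t / max(s, eps) for t, s in zip(p_t, p_s)]


def p_power(weights, p=0.5):
    """Assumed 'p power' flattening: raise each weight to the power p (0 < p < 1)."""
    return [w ** p for w in weights]


def relative_density_ratio(p_t, p_s, alpha=0.5, eps=1e-12):
    """Assumed relative density ratio: target / (alpha*target + (1-alpha)*source).

    For alpha > 0 the result is bounded above by 1/alpha.
    """
    return [t / max(alpha * t + (1 - alpha) * s, eps) for t, s in zip(p_t, p_s)]


def weighted_log_likelihood(log_probs, weights):
    """Instance-weighted training objective: sum of w(x_i) * log P(y_i | x_i)."""
    return sum(w * lp for w, lp in zip(weights, log_probs))


# Toy density estimates for three source-domain instances (made-up numbers).
p_t = [0.02, 0.10, 0.30]  # estimated target-domain density at each instance
p_s = [0.20, 0.10, 0.10]  # estimated source-domain density at each instance

w = density_ratio(p_t, p_s)
w_pow = p_power(w, p=0.5)
w_rel = relative_density_ratio(p_t, p_s, alpha=0.5)
```

Under these assumed definitions, both schemes raise a small weight (a ratio well below 1) toward 1, and the relative ratio additionally caps large weights at 1/alpha. The resulting weights would then multiply each source instance's contribution to the classifier's training objective, as in `weighted_log_likelihood`.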
© 2014 The Association for Natural Language Processing