In this report, we show that domain adaptation for word sense disambiguation (WSD) can be treated as a covariate shift problem, and we address it with the standard solution for covariate shift: maximizing the log-likelihood weighted by the probability density ratio. The key to this approach is the estimation of the probability density ratio, which we estimate with a simple method based on the Naive Bayes model. In our proposed method, we apply the covariate shift method to training data expanded by Daumé's feature augmentation method. In the experiment, we solve six types of domain adaptation for WSD using three domains in the BCCWJ corpus: OC (Yahoo! Chiebukuro), PB (Book), and PN (Newspaper). The results show that our proposed method outperforms Daumé's method, which indicates that even our simple method of estimating the probability density ratio is effective within the covariate shift method. In future work, we intend to investigate methods that estimate the probability density ratio more accurately, and to use an SVM instead of the maximum entropy method. Moreover, the covariate shift method is also effective for unsupervised domain adaptation and is a promising approach to domain adaptation for WSD.
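The weighting scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-features representation, the Laplace smoothing, the toy data, and all function names (`fit_log_probs`, `density_ratio`, `weighted_log_likelihood`) are assumptions made for illustration. The abstract states only that the density ratio is estimated with a Naive Bayes model and used to weight the log-likelihood.

```python
import math
from collections import Counter

def fit_log_probs(examples, vocab, alpha=1.0):
    """Laplace-smoothed log P(feature | domain) for one domain (assumed smoothing)."""
    counts = Counter(f for x in examples for f in x if f in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {f: math.log((counts[f] + alpha) / total) for f in vocab}

def density_ratio(x, log_p_src, log_p_tgt):
    """w(x) = P_T(x) / P_S(x), with P(x) factorized over features (Naive Bayes)."""
    return math.exp(sum(log_p_tgt[f] - log_p_src[f] for f in x if f in log_p_tgt))

def weighted_log_likelihood(labeled, weights, predict_proba):
    """sum_i w(x_i) * log P(y_i | x_i) under the classifier `predict_proba`."""
    return sum(w * math.log(predict_proba(x)[y])
               for (x, y), w in zip(labeled, weights))

# Toy data (invented): each example is the list of context features of one
# occurrence of the target word; source examples carry a sense label.
source = [(["bank", "river"], 0), (["bank", "money"], 1)]
target_unlabeled = [["bank", "money"], ["money", "loan"]]

vocab = {f for x, _ in source for f in x} | {f for x in target_unlabeled for f in x}
log_p_src = fit_log_probs([x for x, _ in source], vocab)
log_p_tgt = fit_log_probs(target_unlabeled, vocab)

# One weight per source training example; the first example looks less
# target-like than the second, so it receives the smaller weight.
weights = [density_ratio(x, log_p_src, log_p_tgt) for x, _ in source]

# Placeholder classifier (uniform over two senses), standing in for the
# maximum entropy model whose parameters would be chosen to maximize this value.
uniform = lambda x: {0: 0.5, 1: 0.5}
objective = weighted_log_likelihood(source, weights, uniform)
```

In this sketch the weights are computed from unlabeled target-domain examples only, which is consistent with the abstract's remark that the covariate shift method also applies to unsupervised domain adaptation. Daumé's feature augmentation step is omitted here.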