Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
Semi-Supervised Japanese Word Sense Disambiguation Based on Two-Stage Classification of Unlabeled Data and Ensemble Learning
Tatsukuni Inoue, Hiroaki Saito
JOURNAL FREE ACCESS

2011 Volume 18 Issue 3 Pages 247-271

Abstract

In this paper, we propose a bootstrapping-like method that eases the empirical selection of optimal parameters for Japanese word sense disambiguation (WSD). Here, bootstrapping refers to semi-supervised learning methods that follow this procedure: (1) train a classifier on labeled examples, (2) use the classifier to select confident unlabeled examples, (3) add them to the labeled examples, and (4) repeat steps 1–3. Traditional bootstrapping methods require empirical selection of parameters, including the pool size, the number of most confident examples, and the number of iterations. Our method classifies unlabeled examples in two stages, using heuristics and a supervised method (a Maximum Entropy classifier), and combines a series of classifiers along a sequence of varying conditions. The method requires only one parameter and enables parameter-robust word sense disambiguation. Experiments against the baseline supervised method on the Japanese WSD task of SemEval-2 show that our method improves accuracy by between 1.56 and 1.8 points.
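
The traditional bootstrapping loop described in steps (1) to (4) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' proposed method or implementation: a simple nearest-centroid classifier stands in for the Maximum Entropy classifier, and all function names (`train_centroids`, `predict_with_confidence`, `bootstrap`), the margin-based confidence score, and the toy data are assumptions made for illustration. The arguments `pool_size`, `top_k`, and `n_iterations` correspond to the three parameters that traditional methods must tune empirically.

```python
import random
from collections import defaultdict


def train_centroids(examples):
    """Stand-in classifier (assumption): one mean feature vector per label."""
    sums, counts = {}, defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict_with_confidence(centroids, features):
    """Return (label, confidence), where confidence is the gap between the
    squared distances to the nearest and second-nearest centroids."""
    dists = sorted(
        (sum((c - f) ** 2 for c, f in zip(centroid, features)), label)
        for label, centroid in centroids.items()
    )
    best_dist, best_label = dists[0]
    margin = dists[1][0] - best_dist if len(dists) > 1 else 0.0
    return best_label, margin


def bootstrap(labeled, unlabeled, pool_size, top_k, n_iterations, seed=0):
    """Generic bootstrapping (self-training) loop; returns the final model."""
    rng = random.Random(seed)
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(n_iterations):                    # step 4: repeat
        if not unlabeled:
            break
        model = train_centroids(labeled)             # step 1: train
        pool = rng.sample(unlabeled, min(pool_size, len(unlabeled)))
        scored = [(predict_with_confidence(model, x), x) for x in pool]
        scored.sort(key=lambda item: item[0][1], reverse=True)
        for (label, _), x in scored[:top_k]:         # step 2: most confident
            labeled.append((x, label))               # step 3: add to labeled
            unlabeled.remove(x)
    return train_centroids(labeled)


# Toy data (hypothetical): two senses in a 2-D feature space.
labeled = [((0.0, 0.0), "sense_a"), ((1.0, 1.0), "sense_b")]
unlabeled = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.1), (0.8, 0.9)]
model = bootstrap(labeled, unlabeled, pool_size=4, top_k=2, n_iterations=2)
print(predict_with_confidence(model, (0.15, 0.1))[0])
```

Even in this toy form, the result depends on three interacting settings, which is the tuning burden the proposed method aims to reduce to a single parameter.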

© 2011 The Association for Natural Language Processing