Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Using Semi-supervised Learning for Question Classification
Tri Thanh Nguyen, Le Minh Nguyen, Akira Shimazu
JOURNAL FREE ACCESS

2008 Volume 15 Issue 1 Pages 3-21

Abstract

Question classification, an important phase in question answering systems, is the task of identifying the type of a given question among a set of predefined types. This study uses unlabeled questions in combination with labeled questions for semi-supervised learning, to improve the precision of the question classification task. As the semi-supervised algorithm, we selected Tri-training because it is a simple but efficient co-training style algorithm. However, Tri-training is not well suited to question data, so we give two proposals to modify Tri-training to make it more suitable. In order to enable its three classifiers to have different initial hypotheses, Tri-training bootstrap-samples the originally labeled set to get different sets for training the three classifiers. The precisions of the three classifiers decrease because of the bootstrap-sampling. To avoid this drawback by allowing each classifier to be initially trained on the originally labeled set while still ensuring the diversity of the three classifiers, our first proposal is to use multiple algorithms for the classifiers in Tri-training; the second proposal is to use multiple algorithms for the classifiers in combination with multiple views. Our experiments show promising results.
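
The two ingredients the abstract mentions, bootstrap-sampling of the labeled set and agreement-based labeling of unlabeled questions, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy rule-based classifiers, and the example questions are all hypothetical, and the three rules merely stand in for three different learning algorithms.

```python
import random
from collections import Counter

def bootstrap_sample(labeled, rng):
    """Standard Tri-training initialisation: draw len(labeled) items
    with replacement, so some labeled examples are left out."""
    return [rng.choice(labeled) for _ in labeled]

def pseudo_label_for(i, classifiers, unlabeled):
    """For classifier i, keep each unlabeled question on which the
    other two classifiers agree, paired with the agreed label."""
    first, second = [c for j, c in enumerate(classifiers) if j != i]
    return [(q, first(q)) for q in unlabeled if first(q) == second(q)]

def majority_vote(classifiers, question):
    """Final prediction: the label chosen by most classifiers."""
    votes = Counter(c(question) for c in classifiers)
    return votes.most_common(1)[0][0]

# Hypothetical stand-ins for three different learning algorithms.
def rule_prefix(q):
    return "PERSON" if q.startswith("Who") else "OTHER"

def rule_contains(q):
    return "PERSON" if "who" in q.lower() else "OTHER"

def rule_constant(q):
    return "OTHER"

classifiers = [rule_prefix, rule_contains, rule_constant]
labeled = [("Who wrote Hamlet?", "PERSON"), ("Where is Hanoi?", "OTHER"),
           ("Who founded Rome?", "PERSON"), ("When did it rain?", "OTHER")]
unlabeled = ["Who painted this?", "Where is Kyoto?"]

sample = bootstrap_sample(labeled, random.Random(0))
for_third = pseudo_label_for(2, classifiers, unlabeled)
for_first = pseudo_label_for(0, classifiers, unlabeled)
```

Because a bootstrap sample is drawn with replacement, it typically omits some of the labeled questions, which is one way to read the drop in precision that the abstract attributes to bootstrap-sampling. Using different algorithms lets every classifier start from the full labeled set while still disagreeing often enough for the agreement step to be informative.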

© The Association for Natural Language Processing