Abstract
Recent advances in speech processing technologies have made the analysis of spoken language one of the central issues in natural language processing. However, traditional linguistically based methods are difficult to apply to spoken language because spoken language is often ill-formed. We have proposed a spoken language analysis method based on a uniform model, which handles well- and ill-formed sentences in a uniform way. In this method, both the problem of finding the best interpretation of a sentence and that of detecting and recovering from ill-formedness are resolved by finding the most preferred dependency analysis of the sentence. This paper presents a preference decision method for our spoken language analysis method. The method is corpus-based: the preference of an analysis candidate is determined according to how frequently such an analysis is observed in the training data. To overcome the data-sparseness problem, not only the instances exactly matching the candidate but also instances similar to the candidate are taken into account. We first give an overview of our spoken language analysis method. Then, after providing the details of the preference decision method, we show its effectiveness by evaluating the performance of an experimental system.