Abstract
A number of studies have applied machine-learning approaches to semantic role labeling, made possible by the availability of corpora such as FrameNet and PropBank. These corpora define semantic roles specific to each frame. This poses a critical problem for machine-learning approaches, because the corpora contain many infrequent roles that hinder efficient learning. This paper focuses on the problem of generalizing semantic roles in the semantic role labeling task. We compare existing generalization criteria with our novel criteria and clarify the characteristics of each criterion. We also show that using multiple generalization criteria in a single model improves the performance of semantic role classification. In experiments on FrameNet, we achieved a 19.16% error reduction in total accuracy and a 7.42% error reduction in macro-averaged F1. On PropBank, we reduced errors by 24.07% in total accuracy and by 26.39% in the evaluation on unseen verbs.