Abstract
The Japanese language has many compound functional expressions, each of which consists of more than one word and includes both content words and functional words, e.g., “_??_” and “_??_”. However, recognition and semantic interpretation of compound functional expressions are especially difficult because a single compound expression often has both a literal content word usage and a non-literal functional usage. This paper proposes an approach to processing Japanese compound functional expressions that identifies them and analyzes their dependency relations using machine learning techniques. First, we formalize the task of identifying Japanese compound functional expressions in a text as a machine-learning-based chunking problem. Next, we apply dependency analysis based on the cascaded chunking model to the results of identifying compound functional expressions. In the experimental evaluation, we first show that the proposed method of chunking compound functional expressions significantly outperforms existing Japanese text processing tools. We then show that, for many types of functional expressions, the cascaded chunking model applied to the results of identifying compound functional expressions outperforms the same model applied without identifying compound functional expressions.