Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper
Construction and Analysis of Multiword Expression-aware Dependency Corpus
Akihiko Kato, Hiroyuki Shindo, Yuji Matsumoto
JOURNAL FREE ACCESS

2019 Volume 26 Issue 4 Pages 663-688

Abstract

Multiword expressions (MWEs) consist of multiple words with syntactic or semantic non-compositionality. Natural Language Processing (NLP) tasks that exploit syntactic dependency information and require an understanding of text meaning are better served by MWE-aware dependency trees (MWE-DTs), in which each MWE is a syntactic unit, than by word-based dependency trees. To treat various continuous MWEs as syntactic units in dependency trees, this study annotates adjective MWEs in the OntoNotes corpus and constructs a dependency corpus that is aware of both functional and adjective MWEs. In NLP tasks requiring semantic understanding, it is also important to recognize verbal MWEs (VMWEs) such as phrasal verbs, which are likely to occur discontinuously. Since dependency information can serve as an effective feature in VMWE recognition, this study examines the tasks of predicting both MWE-DTs and VMWEs. For MWE-DTs, it explores the following three models: (a) a pipeline model of continuous MWE recognition (CMWER) followed by MWE-aware dependency parsing, (b) a model that predicts a word-based dependency tree in which MWE spans are encoded as dependency labels (the head-initial dependency parser), and (c) a hierarchical multitask learning (HMTL) model of CMWER and the model in (b). The experimental results show that the pipeline and HMTL-based models achieve similar F1-scores in CMWER, which are 1.7 points higher than the F1-score of the head-initial dependency parser. For VMWE recognition, integrating a sequential labeler into the above-mentioned HMTL-based model improves F1 by 1.3 points.
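
As an illustration of model (b), the following is a minimal sketch of how MWE spans might be encoded as dependency labels in a head-initial fashion, and recovered again from a predicted tree. The abstract does not specify the exact scheme, so everything here is an assumption made for illustration: the `mwe_` label prefix, the function names, the 0-based head indices with -1 for the root, and the toy sentence and its labels. It is not the authors' implementation.

```python
from typing import List, Tuple

MWE_PREFIX = "mwe_"  # hypothetical label prefix, not taken from the paper

Span = Tuple[int, int, str]  # (start, end_exclusive, MWE part of speech)


def encode_head_initial(heads: List[int], labels: List[str],
                        spans: List[Span]) -> Tuple[List[int], List[str]]:
    """Rewrite a word-based tree so that each MWE span is head-initial.

    heads[i] is the 0-based index of token i's head, or -1 for the root.
    Inside each span, the first token becomes the head of the other tokens,
    and those tokens receive a label that encodes the MWE's part of speech.
    """
    # Map every token to the token that represents it after encoding.
    rep = list(range(len(heads)))
    for start, end, _ in spans:
        for i in range(start, end):
            rep[i] = start

    # Dependents of any MWE-internal token are re-attached to the first token.
    new_heads = [rep[h] if h >= 0 else -1 for h in heads]
    new_labels = list(labels)

    for start, end, pos in spans:
        # Tokens of the span whose head lies outside the span.
        external = [i for i in range(start, end)
                    if not (start <= heads[i] < end)]
        if len(external) != 1:
            raise ValueError("MWE span is not a single syntactic unit")
        top = external[0]
        outer = heads[top]
        # The first token inherits the span's outgoing arc and its label.
        new_heads[start] = rep[outer] if outer >= 0 else -1
        new_labels[start] = labels[top]
        # All remaining tokens attach to the first token with an MWE label.
        for i in range(start + 1, end):
            new_heads[i] = start
            new_labels[i] = MWE_PREFIX + pos
    return new_heads, new_labels


def decode_spans(heads: List[int], labels: List[str]) -> List[Span]:
    """Recover MWE spans from a head-initial encoded tree."""
    found = {}
    for i, label in enumerate(labels):
        if label.startswith(MWE_PREFIX):
            start = heads[i]
            pos = label[len(MWE_PREFIX):]
            _, end, _ = found.get(start, (start, start + 1, pos))
            found[start] = (start, max(end, i + 1), pos)
    return sorted(found.values())


# Toy example (invented): "He left because of rain", with "because of"
# treated as a functional MWE acting as a preposition (IN).
tokens = ["He", "left", "because", "of", "rain"]
heads = [1, -1, 3, 4, 1]
labels = ["nsubj", "root", "advmod", "case", "obl"]

enc_heads, enc_labels = encode_head_initial(heads, labels, [(2, 4, "IN")])
recovered = decode_spans(enc_heads, enc_labels)
```

Under this encoding a standard word-based parser can be trained unchanged, and MWE spans are read off the predicted labels afterwards, which is what allows a single model to produce both the tree and the spans.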

© 2019 The Association for Natural Language Processing