Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
JDMWE: A Japanese Dictionary of Multi-Word Expressions
Kosho Shudo, Toshifumi Tanabe

2010 Volume 17 Issue 5 Pages 5_51-5_74

Abstract
Since the publication of Sag et al. (2002), the NLP community has been aware that one of the most crucial problems in NLP is how to cope with idiosyncratic multiword expressions, which occur in authentic sentences with unexpectedly high frequency. The idiosyncrasy of an expression is, in principle, twofold: one aspect is idiomaticity, i.e. non-compositionality of meaning, and the other is the strong probabilistic binding of the word combination. Accordingly, many attempts have been made in the NLP field to extract such expressions from corpora, mostly by statistical methods. However, presumably because it is difficult to extract them correctly without human insight, no reliable, extensive resource has yet become available. The authors recognized the crucial importance of such irregular expressions around 1970 and started to develop a machine dictionary containing Japanese idioms, idiom-like expressions, and other multiword expressions that consist of frequently co-occurring words. In this paper, we give an overview of the first version of the dictionary, JDMWE (Japanese Dictionary of Multi-Word Expressions). It has about 104,000 head entries and is characterized by:
1. the wide notational, syntactic and semantic variety of contained expressions,
2. the syntactic function and structure given for each entry expression, and
3. the possibility of internal modification indicated for each component word of the entry expression.
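
The three characteristics above suggest what a single entry needs to record. The following is a minimal sketch in Python of one way such a record could be modeled; it is not the actual JDMWE format. All class names, field names, and the label strings ("verbal", "[N wo] V") are hypothetical, and the modifiability flags on the sample idiom are illustrative assumptions rather than values taken from the dictionary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """One component word of an entry (hypothetical field names)."""
    surface: str
    # Assumption: a simple flag marks whether the word may take
    # an internal modifier inside the expression.
    modifiable: bool = False


@dataclass
class MweEntry:
    """Illustrative entry record; not the real JDMWE encoding."""
    head: str                  # the entry expression as written
    syntactic_function: str    # hypothetical label, e.g. "verbal"
    structure: str             # hypothetical structure notation
    components: List[Component] = field(default_factory=list)

    def modifiable_words(self) -> List[str]:
        """Return the component words flagged as internally modifiable."""
        return [c.surface for c in self.components if c.modifiable]


# Sample idiom; the flag values below are assumptions for illustration only.
entry = MweEntry(
    head="足を洗う",
    syntactic_function="verbal",
    structure="[N wo] V",
    components=[
        Component("足"),
        Component("を"),
        Component("洗う", modifiable=True),
    ],
)

print(entry.modifiable_words())
```

Keeping the modifiability flag on each component word, rather than on the entry as a whole, mirrors the third characteristic, which is stated per component word.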
© 2010 The Association for Natural Language Processing