Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 26, Issue 4
Preface
General Paper
  • Akihiko Kato, Hiroyuki Shindo, Yuji Matsumoto
    2019 Volume 26 Issue 4 Pages 663-688
    Published: December 15, 2019
    Released on J-STAGE: March 15, 2020
    JOURNAL FREE ACCESS

    Multiword expressions (MWEs) consist of multiple words and exhibit syntactic or semantic non-compositionality. NLP tasks that exploit syntactic dependency information and require an understanding of the meaning of texts prefer MWE-aware dependency trees (MWE-DTs), in which each MWE forms a syntactic unit, over word-based dependency trees. To treat various continuous MWEs as syntactic units in dependency trees, this study annotates adjective MWEs on the OntoNotes corpus and constructs a dependency corpus that is aware of both functional and adjective MWEs. In NLP tasks requiring semantic understanding, it is also important to recognize verbal MWEs (VMWEs) such as phrasal verbs, which often occur discontinuously. Since dependency information is an effective feature for VMWE recognition, this study addresses the tasks of predicting both MWE-DTs and VMWEs. For MWE-DTs, it explores three models: (a) a pipeline of continuous MWE recognition (CMWER) and MWE-aware dependency parsing, (b) a model that predicts a word-based dependency tree encoding MWE spans as dependency labels (the head-initial dependency parser), and (c) a hierarchical multitask learning (HMTL) model combining CMWER and model (b). The experimental results show that the pipeline and HMTL-based models achieve similar F1-scores in CMWER, both 1.7 points higher than that of the head-initial dependency parser. For VMWE recognition, integrating a sequential labeler into the HMTL-based model improves F1 by 1.3 points.
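
    The encoding in model (b) can be pictured with the minimal sketch below. The function names and the label scheme are our own assumptions, not the paper's implementation: each non-initial word of a continuous MWE attaches to the span's first word under a dedicated label, so an ordinary word-based parser can predict the tree and the spans remain recoverable from the labels alone.

```python
# Hypothetical sketch of a head-initial encoding: MWE spans are stored as
# dependency labels, so a word-based parser can predict them and the spans
# can be recovered afterwards. Assumes the first word of each MWE already
# carries the span's external arc.

def encode_head_initial(heads, labels, mwe_spans, mwe_label="mwe"):
    """heads/labels: 1-indexed word-level tree (head 0 = root).
    mwe_spans: (start, end) word-index pairs of continuous MWEs."""
    new_heads, new_labels = list(heads), list(labels)
    for start, end in mwe_spans:
        for i in range(start + 1, end + 1):
            new_heads[i - 1] = start        # head-initial internal arc
            new_labels[i - 1] = mwe_label   # span membership lives in the label
    return new_heads, new_labels

def decode_mwe_spans(heads, labels, mwe_label="mwe"):
    """Recover continuous MWE spans from the predicted labels."""
    spans = {}
    for i, (head, label) in enumerate(zip(heads, labels), start=1):
        if label == mwe_label:
            spans[head] = max(spans.get(head, head), i)
    return sorted(spans.items())

# "a number of" as one continuous MWE spanning words 1-3 of
# "a number of people" (word 1 attaches externally to "people").
heads, labels = encode_head_initial(
    [4, 1, 1, 0], ["det", "fixed", "fixed", "root"], [(1, 3)])
assert labels == ["det", "mwe", "mwe", "root"]
assert decode_mwe_spans(heads, labels) == [(1, 3)]
```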

    Download PDF (1281K)
  • Kazuki Ashihara, Tomoyuki Kajiwara, Yuki Arase, Satoru Uchida
    2019 Volume 26 Issue 4 Pages 689-710
    Published: December 15, 2019
    Released on J-STAGE: March 15, 2020
    JOURNAL FREE ACCESS

    Distributed word representations are currently employed in many natural language processing tasks. However, when a single representation is generated per word, the meanings of a polysemous word cannot be differentiated because they are merged into one vector. Several attempts have therefore been made to generate a different representation per meaning, based on the part of speech or the topic of a sentence, but these methods are too coarse-grained to handle polysemy well. In this paper, we propose two methods to generate finer-grained multiple word representations. The first generates multiple representations of a word using the word it stands in a dependency relationship with as a clue. The second employs a bi-directional language model to generate a word representation that takes all words in the context into account. Extensive evaluation on the Lexical Substitution and Context-Aware Word Similarity tasks confirmed the effectiveness of our approaches.
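
    The core idea of the first method can be illustrated with the toy sketch below (the function name, the "|" convention, and the choice of the dependency head as the clue are our assumptions, not the paper's exact design): occurrences of a polysemous word are renamed with their dependency-related word, so that standard embedding training then learns one vector per usage.

```python
# Hypothetical sketch of the dependency-clue idea: each occurrence of a
# polysemous target word is renamed with the word it depends on, so
# word2vec-style training learns one vector per (word, clue) pair
# instead of one merged vector.

def annotate_by_head(tokens, heads, targets):
    """tokens: words of one sentence; heads: 1-indexed dependency head per
    word (0 = root); targets: set of polysemous words to split."""
    annotated = []
    for i, token in enumerate(tokens, start=1):
        if token in targets and heads[i - 1] > 0:
            clue = tokens[heads[i - 1] - 1]      # the dependency-related word
            annotated.append(f"{token}|{clue}")  # e.g. "bank|account"
        else:
            annotated.append(token)
    return annotated

# "bank" in "open a bank account" and "bank" in "sat on the river bank"
# become distinct vocabulary items, so each sense gets its own embedding.
print(annotate_by_head(["open", "a", "bank", "account"], [0, 4, 4, 1], {"bank"}))
# -> ['open', 'a', 'bank|account', 'account']
```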

    Download PDF (951K)
  • Yuto Takebayashi, Chenhui Chu, Yuki Arase, Masaaki Nagata
    2019 Volume 26 Issue 4 Pages 711-731
    Published: December 15, 2019
    Released on J-STAGE: March 15, 2020
    JOURNAL FREE ACCESS

    Although the outputs of neural machine translation (NMT) are more fluent than those of conventional phrase-based statistical machine translation, under- and over-generation remain major problems. While the translation quality of phrase-based statistical machine translation can be improved by using a bilingual dictionary as a decoding constraint, the same approach cannot be directly applied to NMT. This paper proposes a rewarding model that applies a bilingual dictionary to NMT. The proposed model first predicts the target words of the translation using the bilingual dictionary and then increases their decoder output probabilities at inference time. Because the model uses the bilingual dictionary as a resource independent of the neural model, the dictionary can easily be updated or replaced when required. The proposed model improves translation quality while having lower computational complexity than lexically constrained decoding, which forces the output of specified words. The results also confirmed that combining the proposed method with a method that biases the decoder toward dictionary entries using attention weights further improves translation quality.
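
    The rewarding step can be pictured with the small sketch below (a simplification with assumed names; the paper's actual reward is presumably tuned rather than a fixed constant): vocabulary ids of dictionary-predicted target words receive an additive bonus on the decoder's scores at each step, biasing rather than forcing the output.

```python
import numpy as np

# Hypothetical sketch of the rewarding idea: target-word ids predicted from
# the bilingual dictionary get a constant bonus added to the decoder's
# scores at each inference step. Unlike lexically constrained decoding,
# this biases the search toward those words without forcing them.

def reward_logits(step_logits, reward_ids, reward=2.0):
    """step_logits: (vocab_size,) decoder scores for one decoding step.
    reward_ids: vocabulary ids of dictionary translations of source words."""
    boosted = step_logits.copy()
    boosted[list(reward_ids)] += reward   # soft additive bias
    return boosted

# Toy example with a vocabulary of 5: the dictionary suggests ids {2, 4}.
step_logits = np.array([0.1, 1.2, 0.8, 0.3, 0.7])
print(reward_logits(step_logits, {2, 4}).argmax())   # 2: the bias wins here
```

    Since the dictionary only supplies the rewarded ids, it can be swapped or updated without retraining the NMT model, which matches the abstract's point about the dictionary being an independent resource.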

    Download PDF (926K)