Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 25, Issue 5
Displaying 1-7 of 7 articles from this issue
Preface
Paper
  • Hao Wang, Yves Lepage
    2018 Volume 25 Issue 5 Pages 487-509
    Published: December 15, 2018
    Released on J-STAGE: March 15, 2019
    JOURNAL FREE ACCESS

    Preordering has proven useful in improving the translation quality of statistical machine translation (SMT), especially for language pairs with different syntax. The top-down bracketing transduction grammar (BTG)-based preordering method (Nakagawa 2015) has achieved state-of-the-art performance since it relies on aligned parallel text only and does not require any linguistic annotations. Although the online learning algorithm it adopts is efficient and effective, it is very susceptible to alignment errors. In a production environment, in particular, such a preorderer is commonly trained on noisy word alignments obtained using an automatic word aligner, resulting in worse performance than preorderers trained on manually annotated datasets. To achieve better preordering using automatically aligned datasets, this paper improves the top-down BTG-based preordering method using various parameter mixing techniques to increase the accuracy of the preorderer and to speed up training via parallelisation. The parameter mixing methods and the original online training method (Nakagawa 2015) were empirically compared, and the experimental results show that such parallel parameter averaging methods can dramatically reduce the training time and improve the quality of preordering. (An illustrative sketch of parameter averaging follows this entry.)

    Download PDF (797K)
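
    As a rough illustration of the parallel parameter averaging referred to in the abstract above, the following Python sketch trains a simple perceptron on data shards and averages the per-shard weights after each epoch (iterative parameter mixing). The perceptron update rule and the toy data are assumptions for illustration; they are not the actual top-down BTG preorderer or its features.

    import numpy as np

    def train_on_shard(w_init, shard, lr=0.1):
        # One online pass over a shard. The mistake-driven perceptron update
        # is a stand-in for the updates used by the BTG preorderer.
        w = w_init.copy()
        for x, y in shard:              # x: feature vector, y: label in {-1, +1}
            if y * np.dot(w, x) <= 0:
                w += lr * y * x
        return w

    def iterative_parameter_mixing(shards, dim, epochs=3):
        # After each epoch, the per-shard weight vectors are averaged and the
        # average is redistributed as the starting point of the next epoch.
        # The per-shard passes are independent and can run in parallel.
        w = np.zeros(dim)
        for _ in range(epochs):
            per_shard = [train_on_shard(w, s) for s in shards]
            w = np.mean(per_shard, axis=0)
        return w

    # Toy usage: random linearly separable data split into 4 shards.
    rng = np.random.default_rng(0)
    true_w = rng.normal(size=8)
    data = [(x, 1 if x @ true_w > 0 else -1) for x in rng.normal(size=(400, 8))]
    shards = [data[i::4] for i in range(4)]
    mixed_w = iterative_parameter_mixing(shards, dim=8)
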
  • Saki Ibe, Yoshitatsu Matsuda, Kazunori Yamaguchi
    2018 Volume 25 Issue 5 Pages 511-525
    Published: December 15, 2018
    Released on J-STAGE: March 15, 2019
    JOURNAL FREE ACCESS

    It is well known that machine translation using recurrent neural networks often composes fluent sentences but may include many unknown words. Although many studies have addressed the unknown word problem, they are ineffective for Japanese-to-English translation. In this study, we propose a hybrid method that builds an alignment table from the attention weight matrix, detects the input words aligned with each unknown word, and finally replaces those unknown words with translations obtained from a statistical machine translation method. We evaluated our approach using two corpora, ASPEC and NTCIR-10. The results showed that the proposed method generated no unknown words and improved the BLEU (BiLingual Evaluation Understudy) score. (A sketch of the attention-based replacement step follows this entry.)

    Download PDF (484K)
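
    The sketch below illustrates the attention-based replacement step described in the abstract above: each <unk> in the output is mapped to the most-attended source word and looked up in an external lexicon. The tokens, attention values, and lexicon are invented; the actual method builds an alignment table from the attention weight matrix and obtains the replacement translations from an SMT system.

    import numpy as np

    def replace_unknown_words(output_tokens, source_tokens, attention, lexicon):
        # attention: array of shape (len(output_tokens), len(source_tokens)),
        # one row of attention weights per generated target token.
        # lexicon: source word -> translation, assumed to come from an SMT
        # lexical table (hypothetical contents here).
        result = []
        for t, token in enumerate(output_tokens):
            if token == "<unk>":
                src_idx = int(np.argmax(attention[t]))          # most-attended source word
                src_word = source_tokens[src_idx]
                result.append(lexicon.get(src_word, src_word))  # copy if no entry
            else:
                result.append(token)
        return result

    # Toy usage: the third output token is unknown and gets replaced.
    source = ["kare", "wa", "kenkyuusha", "da"]
    output = ["he", "is", "<unk>"]
    attn = np.array([[0.70, 0.10, 0.10, 0.10],    # "he"    -> "kare"
                     [0.10, 0.20, 0.10, 0.60],    # "is"    -> "da"
                     [0.05, 0.05, 0.85, 0.05]])   # "<unk>" -> "kenkyuusha"
    lexicon = {"kenkyuusha": "researcher"}
    print(replace_unknown_words(output, source, attn, lexicon))
    # -> ['he', 'is', 'researcher']
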
  • Masayuki Asahara
    2018 Volume 25 Issue 5 Pages 527-554
    Published: December 15, 2018
    Released on J-STAGE: March 15, 2019
    JOURNAL FREE ACCESS

    Japanese noun phrases are not marked by articles, so their information status is not overt. Because of limited contextual information and world knowledge, it is difficult to estimate this information status, which is analyzed in terms of given/new or indefinite/definite status. In Japanese language processing, however, the notion of information status is still poorly understood. In this paper, we explain the information status of Japanese noun phrases and then explore how it can be estimated through reading time. As a first step, we investigate the correlation between reading time and the information status of Japanese noun phrases. The statistical evaluation shows that the information status of noun phrases affects readers’ reading time in Japanese. (A toy illustration of such a correlation analysis follows this entry.)

    Download PDF (1316K)
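
    As a toy illustration of the correlation analysis mentioned in the abstract above, the sketch below computes a point-biserial correlation between a binary given/new status and per-phrase reading times. All figures and variable names are invented; the paper's actual data and statistical modelling are not reproduced here.

    import numpy as np

    # Hypothetical per-noun-phrase records: (reading_time_ms, is_given), where
    # is_given = 1 marks a discourse-given noun phrase and 0 a discourse-new one.
    records = [(310, 1), (295, 1), (288, 1), (301, 1),
               (410, 0), (355, 0), (430, 0), (392, 0)]

    times = np.array([t for t, _ in records], dtype=float)
    given = np.array([g for _, g in records], dtype=float)

    mean_given = times[given == 1].mean()
    mean_new = times[given == 0].mean()
    r = np.corrcoef(given, times)[0, 1]   # point-biserial correlation
    print(f"given: {mean_given:.1f} ms  new: {mean_new:.1f} ms  r = {r:.2f}")
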
  • Hiroki Asano, Tomoya Mizumoto, Kentaro Inui
    2018 Volume 25 Issue 5 Pages 555-576
    Published: December 15, 2018
    Released on J-STAGE: March 15, 2019
    JOURNAL FREE ACCESS

    In grammatical error correction (GEC), the automatic evaluation of system performance is thought to be an essential driving force. Previous methods for automated system assessment require gold-standard references, which must be created manually and thus tend to be both expensive and limited in coverage. To address this problem, a reference-less approach has recently emerged; however, previous reference-less metrics, which consider only the grammaticality of system outputs, have not performed as well as reference-based metrics. In this study, we explore the potential of extending a prior grammaticality-based method to establish a reference-less evaluation method for GEC systems. We empirically show that a reference-less metric combining fluency and meaning preservation with grammaticality estimates manual scores better than commonly used reference-based metrics do. Additionally, we show that the reference-less metric provides appropriate evaluation at the sentence level and that it can be applied to GEC systems. (A sketch of the score combination follows this entry.)

    Download PDF (721K)
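
    The sketch below illustrates the combination step described in the abstract above: a sentence-level score mixing grammaticality, fluency, and meaning-preservation sub-scores. The normalisation to [0, 1], the equal default weights, and the example sub-scores are assumptions for illustration; the sub-models themselves and the paper's tuned estimator are not implemented here.

    def reference_less_score(grammaticality, fluency, meaning, weights=(1.0, 1.0, 1.0)):
        # Weighted combination of three sub-scores, each assumed to lie in [0, 1]
        # (e.g. from a grammaticality classifier, a language-model-based fluency
        # score, and a similarity-based meaning-preservation score).
        wg, wf, wm = weights
        return (wg * grammaticality + wf * fluency + wm * meaning) / (wg + wf + wm)

    # Toy usage: an output that is grammatical and fluent but drifts in meaning.
    print(reference_less_score(grammaticality=0.9, fluency=0.8, meaning=0.4))  # 0.7
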
  • Isao Goto, Hideki Tanaka
    2018 Volume 25 Issue 5 Pages 577-597
    Published: December 15, 2018
    Released on J-STAGE: March 15, 2019
    JOURNAL FREE ACCESS

    Despite its promise, neural machine translation (NMT) presents a serious problem: source content may be mistakenly left untranslated. The ability to detect untranslated content is important for the practical use of NMT. We evaluated two types of probability for identifying untranslated content: the cumulative attention probability and the back-translation probability from the target sentence to the source sentence. Experiments were conducted to discover missing content in Japanese-to-English patent translations. The results revealed that both types of probability were effective, that back translation was more effective than attention, and that combining the two resulted in further improvements. Furthermore, we confirmed that the detection of untranslated content was effective for selecting sentences for human post-editing of machine translation results. (A sketch of the cumulative-attention check follows this entry.)

    Download PDF (935K)
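
    The sketch below illustrates the cumulative-attention side of the approach described above: source positions whose total attention over the whole decode falls below a threshold are flagged as possibly untranslated. The threshold and the matrix are invented, and the back-translation probability and the combination of the two signals are not shown.

    import numpy as np

    def low_attention_positions(attention, threshold=0.3):
        # attention: (target_len, source_len) matrix of decoder attention weights.
        # Summing each column gives the cumulative attention a source word
        # received over the whole decode; a small total suggests it may have
        # been left untranslated. The threshold is illustrative.
        cumulative = attention.sum(axis=0)
        return [i for i, c in enumerate(cumulative) if c < threshold]

    # Toy usage: the third source word receives almost no attention.
    attn = np.array([[0.8, 0.1, 0.0, 0.1],
                     [0.1, 0.7, 0.1, 0.1],
                     [0.1, 0.1, 0.0, 0.8]])
    print(low_attention_positions(attn))   # -> [2]
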
  • Akiva Miura, Graham Neubig, Katsuhito Sudoh, Satoshi Nakamura
    2018 Volume 25 Issue 5 Pages 599-629
    Published: December 15, 2018
    Released on J-STAGE: March 15, 2019
    JOURNAL FREE ACCESS

    Pivot translation is a useful method for translating between languages with little or no parallel data, as it exploits equivalents in an intermediate language such as English. Commonly, phrase-based or tree-based pivot translation methods merge source–pivot and pivot–target translation models into a source–target model, a tactic known as triangulation. However, because the combination is based on the surface forms of constituent words, it often produces incorrect source–target phrase pairs owing to interlingual differences and semantic ambiguities in the pivot language, which degrades translation accuracy. This paper proposes a triangulation approach that uses syntactic subtrees in the pivot language to avoid incorrect phrase combinations by distinguishing pivot-language words by their syntactic roles. The results of experiments conducted on the United Nations Parallel Corpus demonstrate that the proposed method is superior to other pivot translation approaches in all tested language combinations. (A sketch of baseline surface-form triangulation follows this entry.)

    Download PDF (331K)
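
    The sketch below shows the baseline surface-form triangulation that the paper improves upon: source–pivot and pivot–target phrase probabilities are multiplied and summed over shared pivot phrases, p(t|s) = sum_p p(t|p) p(p|s). The phrase tables are invented, and the paper's contribution of distinguishing pivot entries by their syntactic subtrees is only noted in a comment.

    from collections import defaultdict

    def triangulate(src_pivot, pivot_tgt):
        # src_pivot: {source_phrase: {pivot_phrase: p(pivot | source)}}
        # pivot_tgt: {pivot_phrase: {target_phrase: p(target | pivot)}}
        # Surface-form triangulation marginalises over pivot phrases shared by
        # the two tables. The proposed method would additionally key pivot
        # entries by their syntactic subtree, which this sketch does not model.
        src_tgt = defaultdict(lambda: defaultdict(float))
        for s, pivots in src_pivot.items():
            for p, p_ps in pivots.items():
                for t, p_tp in pivot_tgt.get(p, {}).items():
                    src_tgt[s][t] += p_ps * p_tp
        return {s: dict(ts) for s, ts in src_tgt.items()}

    # Toy usage with an ambiguous pivot phrase "bank".
    src_pivot = {"banco": {"bank": 1.0}}
    pivot_tgt = {"bank": {"Bank": 0.6, "Ufer": 0.4}}   # financial vs. river sense
    print(triangulate(src_pivot, pivot_tgt))
    # -> {'banco': {'Bank': 0.6, 'Ufer': 0.4}}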