Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 27, Issue 3
Preface
General Paper
  • Shohei Higashiyama, Masao Utiyama, Eiichiro Sumita, Masao Ideuchi, Yos ...
    2020 Volume 27 Issue 3 Pages 499-530
    Published: September 15, 2020
    Released on J-STAGE: December 15, 2020
    JOURNAL FREE ACCESS

    Although limited effort has been devoted to exploring neural models for Japanese word segmentation, such models have been actively applied to Chinese word segmentation because they can minimize the effort required for feature engineering. In this work, we propose a character-based neural model that makes joint use of word information to disambiguate word boundaries. For each character in a sentence, our model uses an attention mechanism to estimate the importance of multiple candidate words that contain the character. Experimental results show that learning to attend to the proper words leads to accurate segmentation and that our model achieves better performance than existing statistical and neural models on both in-domain and cross-domain Japanese word segmentation datasets. A schematic sketch of the candidate-word attention follows this entry.

    Download PDF (1320K)
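
    The following is a minimal, hedged sketch (in PyTorch, not the authors' released code) of the idea described above: for each character, attention weights are computed over the embeddings of candidate words containing that character, and the attention-pooled word vector is concatenated with the character state before tagging. All names and dimensions (char_dim, word_dim, num_tags) are placeholders chosen for illustration.

    import torch
    import torch.nn as nn

    class CandidateWordAttention(nn.Module):
        """Pools embeddings of candidate words containing each character,
        weighted by attention scores computed against the character state."""

        def __init__(self, char_dim, word_dim):
            super().__init__()
            self.proj = nn.Linear(char_dim, word_dim)

        def forward(self, char_states, cand_word_embs, cand_mask):
            # char_states:    (batch, seq_len, char_dim)
            # cand_word_embs: (batch, seq_len, max_cands, word_dim)
            # cand_mask:      (batch, seq_len, max_cands), 1 for real candidates
            query = self.proj(char_states).unsqueeze(2)             # (b, s, 1, w)
            scores = (query * cand_word_embs).sum(-1)               # (b, s, c)
            scores = scores.masked_fill(cand_mask == 0, -1e9)
            attn = torch.softmax(scores, dim=-1)
            pooled = (attn.unsqueeze(-1) * cand_word_embs).sum(2)   # (b, s, w)
            return pooled, attn

    class SegmenterSketch(nn.Module):
        """Character BiLSTM tagger augmented with candidate-word attention."""

        def __init__(self, n_chars, n_words, char_dim=64, word_dim=64,
                     hidden=128, num_tags=4):
            super().__init__()
            self.char_emb = nn.Embedding(n_chars, char_dim)
            self.word_emb = nn.Embedding(n_words, word_dim, padding_idx=0)
            self.encoder = nn.LSTM(char_dim, hidden, batch_first=True,
                                   bidirectional=True)
            self.word_attn = CandidateWordAttention(2 * hidden, word_dim)
            self.out = nn.Linear(2 * hidden + word_dim, num_tags)

        def forward(self, chars, cand_words, cand_mask):
            h, _ = self.encoder(self.char_emb(chars))
            pooled, attn = self.word_attn(h, self.word_emb(cand_words), cand_mask)
            return self.out(torch.cat([h, pooled], dim=-1)), attn
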
  • Chunpeng Ma, Akihiro Tamura, Masao Utiyama, Tiejun Zhao, Eiichiro Sumi ...
    2020 Volume 27 Issue 3 Pages 531-552
    Published: September 15, 2020
    Released on J-STAGE: December 15, 2020
    JOURNAL FREE ACCESS

    The encoder-decoder attention matrix has been regarded as the (soft) alignment model for conventional neural machine translation (NMT) models such as RNN-based models. However, we show empirically that this is not true for the Transformer. By comparing the Transformer with the RNN-based NMT model, we find two inherent differences and accordingly present two methods for capturing word alignments in the Transformer. Furthermore, going beyond the Transformer, we present three axioms for an attention mechanism that captures word alignments and propose a new attention mechanism based on these axioms, termed the axiomatic attention mechanism (AAM), which is applicable to any NMT model. The AAM depends on a perturbation function, and we apply several perturbation functions to the AAM, including a novel function based on the masked language model (Devlin, Chang, Lee, and Toutanova 2019). Using the AAM to guide the training of an NMT model improved both the translation performance and the word alignments learned by the NMT model. Our research sheds light on the interpretation of sequence-to-sequence models for neural machine translation. A schematic sketch of perturbation-based alignment extraction follows this entry.

    Download PDF (378K)
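
    As a hedged illustration of the general idea that word alignments can be read off by perturbing the source side (the paper's AAM is defined axiomatically and is not reproduced here), the sketch below scores the alignment of each target position to each source position by the drop in the model's log-probability for the target token when that source token is perturbed. The callables nmt_log_prob and perturb are hypothetical placeholders.

    import torch

    def alignment_by_perturbation(nmt_log_prob, perturb, src_tokens, tgt_tokens):
        """nmt_log_prob(src, tgt) -> per-position log-probabilities of the
        target tokens; perturb(src, i) -> source sentence with token i
        perturbed (e.g. masked or dropped)."""
        base = nmt_log_prob(src_tokens, tgt_tokens)          # (tgt_len,)
        scores = torch.zeros(len(tgt_tokens), len(src_tokens))
        for i in range(len(src_tokens)):
            perturbed = perturb(src_tokens, i)
            scores[:, i] = base - nmt_log_prob(perturbed, tgt_tokens)
        # Normalize rows so each target word gets a distribution over source words.
        return torch.softmax(scores, dim=1)
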
  • Hiroyuki Deguchi, Akihiro Tamura, Takashi Ninomiya
    2020 Volume 27 Issue 3 Pages 553-571
    Published: September 15, 2020
    Released on J-STAGE: December 15, 2020
    JOURNAL FREE ACCESS

    This paper proposes a new Transformer neural machine translation (NMT) model that incorporates dependency relations into self-attention on both the source and target sides, which we call “dependency-based self-attention”. The dependency-based self-attention is trained to attend to the modifiee of each token under constraints based on dependency relations. It was inspired by linguistically-informed self-attention (LISA), which was originally designed for the Transformer encoder for semantic role labeling. This paper extends LISA to Transformer NMT by masking future information on words in the decoder-side dependency-based self-attention. Furthermore, our dependency-based self-attention operates on sub-word units created by byte pair encoding. In the experiments, our model achieved improvements of 1.04 and 0.30 BLEU points over the baseline model on the Asian Scientific Paper Excerpt Corpus Japanese-to-English and English-to-Japanese translation tasks, respectively. A minimal sketch of a dependency-supervised attention head follows this entry.

    Download PDF (714K)
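
    The sketch below illustrates, under stated assumptions, a LISA-style attention head whose distribution is supervised with gold modifiee positions and which can be causally masked on the decoder side; it is not the paper's implementation, and all names and dimensions are placeholders.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DependencyAttentionHead(nn.Module):
        def __init__(self, d_model, d_head=64):
            super().__init__()
            self.q = nn.Linear(d_model, d_head)
            self.k = nn.Linear(d_model, d_head)
            self.v = nn.Linear(d_model, d_head)

        def forward(self, x, modifiee_index=None, causal=False):
            # x: (batch, seq_len, d_model); modifiee_index: (batch, seq_len)
            # gold head positions, used only for an auxiliary supervision loss.
            q, k, v = self.q(x), self.k(x), self.v(x)
            scores = q @ k.transpose(1, 2) / k.size(-1) ** 0.5   # (b, s, s)
            if causal:
                # Decoder side: mask future positions so no token attends ahead.
                s = x.size(1)
                future = torch.triu(torch.ones(s, s, dtype=torch.bool), 1)
                scores = scores.masked_fill(future, float("-inf"))
            attn = torch.softmax(scores, dim=-1)
            aux_loss = None
            if modifiee_index is not None:
                # Train the head to peak at the modifiee (this sketch assumes
                # the gold modifiee is not a masked future position).
                aux_loss = F.cross_entropy(
                    scores.reshape(-1, scores.size(-1)),
                    modifiee_index.reshape(-1))
            return attn @ v, aux_loss
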
  • Shohei Higashiyama, Masao Utiyama, Yuji Matsumoto, Taro Watanabe, Eiic ...
    2020 Volume 27 Issue 3 Pages 573-598
    Published: September 15, 2020
    Released on J-STAGE: December 15, 2020
    JOURNAL FREE ACCESS

    Recent work has explored various neural network-based methods for word segmentation and has achieved substantial progress, mainly in in-domain scenarios. There remains, however, a problem of performance degradation on target domains for which labeled data is not available. A key issue in overcoming this problem is how to use linguistic resources in target domains, such as unlabeled data and lexicons, which can be collected or constructed more easily than fully labeled data. In this work, we propose a novel method that uses unlabeled data and lexicons for cross-domain word segmentation. We introduce an auxiliary prediction task, Lexicon Word Prediction, into a character-based segmenter to identify occurrences of lexical entries in unlabeled sentences. The experiments demonstrate that the proposed method achieves accurate segmentation across various Japanese and Chinese domains. A sketch of how such auxiliary labels can be derived from a lexicon follows this entry.

    Download PDF (396K)
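
    The following is a small, self-contained sketch (not the authors' implementation) of how auxiliary labels for a lexicon-word-prediction task could be derived from unlabeled text: characters covered by a longest-matching lexicon entry receive B/I/E tags and all others O, and these labels can then supervise an auxiliary tagging head that shares the character encoder with the main segmenter. Single-character entries are ignored in this sketch.

    def lexicon_word_labels(sentence, lexicon, max_word_len=8):
        labels = ["O"] * len(sentence)
        i = 0
        while i < len(sentence):
            # Longest-match search for a lexicon entry starting at position i.
            match = None
            for j in range(min(len(sentence), i + max_word_len), i, -1):
                if sentence[i:j] in lexicon:
                    match = j
                    break
            if match is None or match - i == 1:
                i += 1
                continue
            labels[i] = "B"
            for k in range(i + 1, match - 1):
                labels[k] = "I"
            labels[match - 1] = "E"
            i = match
        return labels

    if __name__ == "__main__":
        lexicon = {"自然言語", "処理"}             # toy domain lexicon
        print(lexicon_word_labels("自然言語処理の研究", lexicon))
        # -> ['B', 'I', 'I', 'E', 'B', 'E', 'O', 'O', 'O']
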
  • Hayate Iso, Yui Uehara, Tatsuya Ishigaki, Hiroshi Noji, Eiji Aramaki, ...
    2020 Volume 27 Issue 3 Pages 599-626
    Published: September 15, 2020
    Released on J-STAGE: December 15, 2020
    JOURNAL FREE ACCESS

    We propose a data-to-text generation model with two modules, one for tracking and the other for text generation. The tracking module selects and keeps track of salient information and remembers which records have already been mentioned. The generation module produces a summary conditioned on the state of the tracking module. We also explore the effectiveness of writer information for generation. Experimental results show that our model outperforms existing models on all evaluation metrics even without writer information. Incorporating writer information further improves performance, contributing to both content planning and surface realization. A schematic sketch of a record-tracking module follows this entry.

    Download PDF (7601K)
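
    A minimal sketch (placeholder names and dimensions, not the authors' code) of a tracking module that selects a salient record at each step, remembers which records have already been mentioned, and exposes its updated state for the generator to condition on:

    import torch
    import torch.nn as nn

    class RecordTracker(nn.Module):
        def __init__(self, record_dim, state_dim):
            super().__init__()
            self.score = nn.Linear(record_dim + state_dim, 1)
            self.update = nn.GRUCell(record_dim, state_dim)

        def forward(self, records, state, mentioned):
            # records:   (num_records, record_dim)
            # state:     (state_dim,) current tracking state
            # mentioned: (num_records,) bool, True if the record was already used
            expanded = state.unsqueeze(0).expand(records.size(0), -1)
            scores = self.score(torch.cat([records, expanded], dim=-1)).squeeze(-1)
            scores = scores.masked_fill(mentioned, float("-inf"))
            idx = int(torch.softmax(scores, dim=0).argmax())
            new_state = self.update(records[idx].unsqueeze(0),
                                    state.unsqueeze(0)).squeeze(0)
            new_mentioned = mentioned.clone()
            new_mentioned[idx] = True
            # The generator would condition each output span on new_state.
            return idx, new_state, new_mentioned
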
  • Chenlong Hu, Mikio Nakano, Manabu Okumura
    2020 Volume 27 Issue 3 Pages 627-652
    Published: September 15, 2020
    Released on J-STAGE: December 15, 2020
    JOURNAL FREE ACCESS

    Mutual bootstrapping is a commonly used technique for many natural language processing tasks, including semantic lexicon induction. Among the many bootstrapping methods, the Basilisk algorithm has led to successful applications through two key iterative steps: scoring context patterns and scoring candidate instances. In this work, we improve Basilisk by modifying these two scoring functions. By incorporating an AutoEncoder into the scoring functions for patterns and candidates, we can reduce bias problems and obtain more balanced results. The experimental results demonstrate that our proposed methods for guiding the bootstrapping of a semantic lexicon with an AutoEncoder can boost overall performance. A schematic sketch of the bootstrapping loop follows this entry.

    Download PDF (451K)
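
    The sketch below outlines a Basilisk-style mutual bootstrapping loop with simplified scoring; the paper's contribution, modifying both scoring functions with an AutoEncoder, is only indicated by the hypothetical autoencoder_score hook, which is an assumption about where such a score could be blended in rather than the authors' exact formulation.

    import math

    def bootstrap(seeds, patterns, extracts, iterations=5, n_patterns=20,
                  n_add=5, autoencoder_score=None):
        """seeds: initial lexicon (set of instances);
        extracts[p]: set of instances extracted by context pattern p."""
        lexicon = set(seeds)
        for _ in range(iterations):
            # 1) Score patterns by the fraction of their extractions already
            #    in the lexicon, and keep a pool of top-scoring patterns.
            pat_scores = {p: len(extracts[p] & lexicon) / (len(extracts[p]) or 1)
                          for p in patterns}
            pool = sorted(patterns, key=pat_scores.get, reverse=True)[:n_patterns]
            # 2) Score candidate instances by log-scaled support from the pool.
            cand_scores = {}
            for p in pool:
                for cand in extracts[p] - lexicon:
                    cand_scores[cand] = cand_scores.get(cand, 0.0) + \
                        math.log(1 + len(extracts[p] & lexicon))
            if autoencoder_score is not None:
                # Hypothetical hook: blend in a reconstruction-based score to
                # counteract the bias toward a few dominant patterns.
                for cand in cand_scores:
                    cand_scores[cand] += autoencoder_score(cand)
            # 3) Add the best candidates to the lexicon and iterate.
            best = sorted(cand_scores, key=cand_scores.get, reverse=True)[:n_add]
            lexicon.update(best)
        return lexicon
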
Society Column