Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 29, Issue 1
Displaying 1-19 of 19 articles from this issue
Preface (Non Peer-Reviewed)
General Paper (Peer-Reviewed)
  • Kosuke Takahashi, Katsuhito Sudoh, Satoshi Nakamura
    2022 Volume 29 Issue 1 Pages 3-22
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    As the performance of machine translation has improved, the need for a human-like automatic evaluation metric has increased. Using multiple reference translations against a system translation (a hypothesis) is an established strategy for improving such evaluation metrics; however, preparing multiple references is highly expensive and often impractical. In this study, we propose an automatic evaluation method for machine translation that uses source sentences as additional pseudo-references. The proposed method evaluates a translation hypothesis via regression to assign a real-valued score. The model takes the paired source, reference, and hypothesis sentences together as input; a pre-trained large-scale cross-lingual language model encodes the input into sentence vectors, from which the model predicts a human evaluation score. Experimental results show that the proposed method correlates consistently more strongly with human judgments than baseline methods that depend solely on hypothesis and reference sentences, especially when the hypotheses are very high- or low-quality translations.
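
    A minimal sketch of this regression setup, using stand-in embeddings in place of the pre-trained cross-lingual encoder (the helpers `encode` and `predict_score`, the vector dimension, and the untrained regression weights are all illustrative, not the paper's implementation):

```python
import zlib

import numpy as np

def encode(sentence, dim=16):
    # Stand-in for a pre-trained cross-lingual encoder: a deterministic
    # pseudo-embedding derived from the text, for illustration only.
    seed = zlib.crc32(sentence.encode("utf-8"))
    return np.random.default_rng(seed).standard_normal(dim)

def predict_score(source, reference, hypothesis, w, b=0.0):
    # Concatenate source, reference, and hypothesis sentence vectors and
    # apply a linear regression head that outputs a real-valued quality
    # score; the source sentence acts as an extra pseudo-reference.
    x = np.concatenate([encode(source), encode(reference), encode(hypothesis)])
    return float(x @ w + b)
```

    In the actual method, `w` and `b` would be trained to regress human evaluation scores; here they are free parameters.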

  • Yiran Wang, Hiroyuki Shindo, Yuji Matsumoto, Taro Watanabe
    2022 Volume 29 Issue 1 Pages 23-52
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    This paper presents a novel method for nested named entity recognition. Our layered method extends the prior second-best path recognition method by explicitly excluding the influence of the best path. It maintains a set of hidden states at each time step and selectively leverages them to build a different potential function for recognition at each level. In addition, we demonstrate that recognizing innermost entities first yields better performance than the conventional outermost-first scheme. We provide extensive experimental results on the ACE2004, ACE2005, GENIA, and NNE datasets to show the effectiveness and efficiency of our proposed method.
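
    The innermost-first ordering can be illustrated with a small, hypothetical grouping routine (this is not the paper's model; it only shows how nested spans are peeled from the inside out, layer by layer):

```python
def layers_innermost_first(spans):
    # spans: (start, end, label) tuples, possibly nested.
    # Layer 0 holds entities containing no other entity (innermost),
    # recognized first; each later layer holds entities that only
    # enclose already-recognized ones.
    def contains(outer, inner):
        return (outer[0] <= inner[0] <= inner[1] <= outer[1]
                and outer[:2] != inner[:2])
    remaining, layers = list(spans), []
    while remaining:
        inner = [s for s in remaining
                 if not any(contains(s, t) for t in remaining)]
        layers.append(inner)
        remaining = [s for s in remaining if s not in inner]
    return layers
```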

  • Shiki Sato, Reina Akama, Hiroki Ouchi, Jun Suzuki, Kentaro Inui
    2022 Volume 29 Issue 1 Pages 53-83
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    An automatic evaluation framework for open-domain dialogue response generation systems that can validate the effects of daily system improvements at low cost is needed. However, existing metrics commonly used for automatic response generation evaluation, such as bilingual evaluation understudy (BLEU), correlate poorly with human evaluation. This poor correlation arises from the nature of dialogue: an input context admits several acceptable responses. To address this issue, we focus on evaluating response generation systems via response selection, a task in which, for a given context, systems select an appropriate response from a set of response candidates. Because the systems can only select specific candidates, evaluation via response selection mitigates the effect of the above-mentioned nature of dialogue. Typically, false response candidates are randomly sampled from other, unrelated dialogues, which leads to two issues: (a) unrelated false candidates and (b) acceptable utterances marked as false. Response selection test sets built this way are unreliable. Thus, this paper proposes a method for constructing response selection test sets with well-chosen false candidates. Experiments demonstrate that evaluating systems via response selection with well-chosen false candidates correlates more strongly with human evaluation than commonly used automatic evaluation metrics such as BLEU.
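
    A toy sketch of the candidate-filtering idea, using simple lexical overlap as a stand-in for the paper's actual selection criteria (the function names and thresholds are illustrative):

```python
def jaccard(a, b):
    # Token-level Jaccard overlap between two utterances.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(1, len(ta | tb))

def choose_false_candidates(context, gold, pool, k=3,
                            min_related=0.1, max_acceptable=0.5):
    # Keep candidates that share vocabulary with the context (avoids
    # issue (a): unrelated false candidates) but that are not too close
    # to the gold response (avoids issue (b): acceptable utterances
    # marked as false).
    scored = [(jaccard(context, u), u) for u in pool
              if jaccard(gold, u) < max_acceptable
              and jaccard(context, u) >= min_related]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [u for _, u in scored[:k]]
```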

  • Junya Takayama, Tomoyuki Kajiwara, Yuki Arase
    2022 Volume 29 Issue 1 Pages 84-111
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    People often present their intentions indirectly in text. For example, if a person says to an operator of a reservation service, “I don’t have enough budget.”, it means “Please find a cheaper option for me.” While neural conversation models acquire the ability to generate fluent responses through training on a dialogue corpus, previous corpora have not focused on indirect responses. We create a large-scale dialogue corpus that provides pragmatic paraphrases to advance technology for understanding users’ underlying intentions. Our corpus provides a total of 71,498 indirect-direct utterance pairs accompanied by multi-turn dialogue histories extracted from the MultiWOZ dataset. In addition, we propose three tasks to benchmark the ability of models to recognize and generate indirect and direct utterances, and we investigate the performance of state-of-the-art pre-trained language models as baselines. We confirmed that the performance of dialogue response generation improves when indirect user utterances are converted into direct ones.
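
    As an illustration, one indirect-direct pair with its dialogue history (built from the abstract's own example) might be represented as follows; the field names are hypothetical, not the corpus's actual schema:

```python
# Hypothetical record shape for one indirect-direct utterance pair.
record = {
    "history": [
        "Operator: Hello, how can I help you?",
        "User: I'd like to book a hotel in the city centre.",
    ],
    "indirect": "I don't have enough budget.",
    "direct": "Please find a cheaper option for me.",
}
```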

  • Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki, Naoaki Okaza ...
    2022 Volume 29 Issue 1 Pages 112-143
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    We propose a novel method for finding an appropriate tokenization for a given downstream model by jointly optimizing the tokenizer and the model. The proposed method imposes no restriction other than using the loss values computed by the downstream model to train the tokenizer, and thus it can be applied to various NLP tasks. Moreover, it can explore an appropriate tokenization that improves the performance of an already trained model, as a post-processing step. The proposed method is therefore applicable to a wide range of situations. We evaluated whether our method improves performance on text classification in three languages and machine translation in eight language pairs. Experimental results show that our proposed method improves performance by determining appropriate tokenizations.
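
    The post-processing use case can be sketched by scoring candidate tokenizations of the same sentence with the downstream model's loss and keeping the best one (a toy stand-in: the actual method also trains the tokenizer with these loss values):

```python
def select_tokenization(candidates, downstream_loss):
    # The only signal the method needs is the loss the downstream model
    # assigns to each candidate tokenization, so selection also works as
    # post-processing for an already trained model.
    return min(candidates, key=downstream_loss)

# Toy usage: a hypothetical loss that simply prefers fewer tokens.
candidates = [["un", "related"], ["unrelated"], ["u", "n", "related"]]
best = select_tokenization(candidates, downstream_loss=len)
```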

  • Ryuji Kano, Tomoki Taniguchi, Tomoko Ohkuma
    2022 Volume 29 Issue 1 Pages 144-165
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    Previous research on summarization models regards titles as summaries of source texts; however, much research has reported that such training data are noisy. We propose an effective curriculum learning method for training summarization models on noisy data. Curriculum learning improves performance by sorting training data by difficulty or noisiness and is effective for training models on noisy data, but previous research has not applied it to summarization tasks. One aim of this research is to validate the effectiveness of curriculum learning for summarization tasks. In translation tasks, previous research quantified noise using two models trained on noisy and clean corpora; because such paired corpora do not exist in the summarization field, that method is difficult to apply to summarization tasks. Another aim of this research is therefore to propose a model that can quantify noise using a single noisy corpus. The training task of the proposed model, the Appropriateness Estimator, is to distinguish correct source-summary pairs from randomly assigned pairs; throughout training, the model learns to compute the appropriateness of source-summary pairs. We conduct experiments on three summarization models and verify that curriculum learning and our method improve performance.
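
    A minimal sketch of the two ingredients, assuming toy data (the function names are illustrative): negatives for the estimator come from randomly reassigning summaries within the same noisy corpus, and training pairs are then ordered by estimated appropriateness:

```python
import random

def estimator_training_data(pairs, seed=0):
    # pairs: list of (source, summary). Positives are the true pairs;
    # negatives reassign each source a summary drawn at random from the
    # same corpus, so no separate clean corpus is needed.
    rng = random.Random(seed)
    summaries = [summ for _, summ in pairs]
    data = [(src, summ, 1) for src, summ in pairs]
    for src, true_summ in pairs:
        wrong = rng.choice(summaries)
        while wrong == true_summ and len(set(summaries)) > 1:
            wrong = rng.choice(summaries)
        data.append((src, wrong, 0))
    return data

def curriculum_order(pairs, appropriateness):
    # Sort training pairs from cleanest to noisiest, i.e. by
    # descending appropriateness score.
    return sorted(pairs, key=appropriateness, reverse=True)
```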

  • Abdurrisyad Fikri, Hiep V. Le, Takashi Miyazaki, Manabu Okumura, Nobuy ...
    2022 Volume 29 Issue 1 Pages 166-186
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    Building good conversation agents is assumed to require an accurate conversation context. We argue that a conversation scene that includes the speakers can provide more information about the context, since using images as conversation contexts has proven effective. We constructed a visual conversation scene dataset (VCSD) that provides scene images corresponding to conversations. Each entry combines (1) a conversation scene image (third-person view), (2) the corresponding first utterance and its response, and (3) the corresponding speaker, respondent, and topic object. In our experiments on the response-selection task, we first examined BERT (text only) as a baseline. Although BERT performed well in general conversations, where a response continues from the previous utterance, it failed in cases where visual information was necessary to understand the context. Our error analysis found that conversations requiring visual contexts fall into three types: visual question answering, image-referring responses, and scene understanding. To exploit the conversation scene images and their focused parts, that is, the speaker, respondent, and topic object, we proposed a model that receives texts and multiple image features as inputs. Our model can capture this information and achieves 91% accuracy.
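
    A minimal sketch of the fusion step, assuming precomputed feature vectors (the function name, feature names, and dimensions are illustrative, not the paper's architecture):

```python
import numpy as np

def score_candidate(utterance_vec, scene_vec, speaker_vec,
                    respondent_vec, object_vec, w):
    # Fuse the text encoding with the scene image and its focused parts
    # (speaker, respondent, topic object) by concatenation, then score
    # one response candidate with a linear layer.
    x = np.concatenate([utterance_vec, scene_vec, speaker_vec,
                        respondent_vec, object_vec])
    return float(x @ w)
```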

  • Youmi Ma, Tatsuya Hiraoka, Naoaki Okazaki
    2022 Volume 29 Issue 1 Pages 187-223
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    In this study, we propose a method designed to extract named entities and relations from unstructured text based on table representations. To extract named entities, the proposed method computes representations for entity mentions and long-range dependencies using contextualized representations, without hand-crafted features or complex neural network architectures. To extract relations, it applies a tensor dot product to predict all relation labels simultaneously, without considering dependencies among relation labels. These advancements significantly simplify the proposed model and the associated algorithm for extracting named entities and relations. Despite its simplicity, the experimental results demonstrate that the proposed approach outperforms state-of-the-art methods on multiple datasets. Compared with existing table-filling approaches, the proposed method achieves high performance solely by predicting relation labels independently. In addition, we found that incorporating dependencies among relation labels into the system yielded little performance gain, indicating the effectiveness and sufficiency of the tensor dot-product mechanism for relation extraction in the proposed architecture. We also performed experimental analyses to explore the benefits of joint training with named entity recognition for relation extraction in our design, and we concluded that joint training with named entity recognition assists relation extraction by improving the span-level representations of entities.
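
    The label-independent scoring can be sketched with a bilinear tensor dot product over span representations (a simplified stand-in for the paper's exact parameterization):

```python
import numpy as np

def relation_scores(spans, W):
    # spans: (n, d) span representations from the entity table;
    # W: (L, d, d), one bilinear map per relation label. A single tensor
    # dot product scores every (head, tail, label) triple at once, with
    # no dependencies among relation labels.
    return np.einsum("id,ldk,jk->ijl", spans, W, spans)
```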

Society Column (Non Peer-Reviewed)
Information (Non Peer-Reviewed)