Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 29, Issue 1
Displaying 1-19 of 19 articles from this issue
Preface (Non Peer-Reviewed)
General Paper (Peer-Reviewed)
  • Kosuke Takahashi, Katsuhito Sudoh, Satoshi Nakamura
    2022 Volume 29 Issue 1 Pages 3-22
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    As the performance of machine translation has improved, the need for a human-like automatic evaluation metric has increased. Using multiple reference translations against a system translation (a hypothesis) is an established strategy for improving such evaluation metrics; however, preparing multiple references is highly expensive and often impractical. In this study, we propose an automatic evaluation method for machine translation that uses source sentences as additional pseudo-references. The proposed method evaluates a translation hypothesis via regression to assign a real-valued score. The model takes the paired source, reference, and hypothesis sentences together as input; a pre-trained large-scale cross-lingual language model encodes the input into sentence vectors, from which the model predicts a human evaluation score. Experimental results show that the proposed method correlates consistently more strongly with human judgments than baseline methods that depend solely on hypothesis and reference sentences, especially when the hypotheses are very high- or low-quality translations.
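
    A minimal sketch of this regression setup, using stand-in embeddings in place of the pre-trained cross-lingual encoder (the helpers `encode` and `predict_score`, the vector dimension, and the untrained regression weights are all illustrative, not the paper's implementation):

```python
import zlib

import numpy as np

def encode(sentence, dim=16):
    # Stand-in for a pre-trained cross-lingual encoder: a deterministic
    # pseudo-embedding derived from the text, for illustration only.
    seed = zlib.crc32(sentence.encode("utf-8"))
    return np.random.default_rng(seed).standard_normal(dim)

def predict_score(source, reference, hypothesis, w, b=0.0):
    # Concatenate source, reference, and hypothesis sentence vectors and
    # apply a linear regression head that outputs a real-valued quality
    # score; the source sentence acts as an extra pseudo-reference.
    x = np.concatenate([encode(source), encode(reference), encode(hypothesis)])
    return float(x @ w + b)
```

    In the actual method, `w` and `b` would be trained to regress human evaluation scores; here they are free parameters.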

  • Yiran Wang, Hiroyuki Shindo, Yuji Matsumoto, Taro Watanabe
    2022 Volume 29 Issue 1 Pages 23-52
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    This paper presents a novel method for nested named entity recognition. Our layered method extends the prior second-best path recognition method by explicitly excluding the influence of the best path. It maintains a set of hidden states at each time step and selectively leverages them to build a different potential function for recognition at each level. In addition, we demonstrate that recognizing innermost entities first yields better performance than the conventional outermost-first scheme. We provide extensive experimental results on the ACE2004, ACE2005, GENIA, and NNE datasets to show the effectiveness and efficiency of our proposed method.
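
    The innermost-first ordering can be illustrated with a small, hypothetical grouping routine (this is not the paper's model; it only shows how nested spans are peeled from the inside out, layer by layer):

```python
def layers_innermost_first(spans):
    # spans: (start, end, label) tuples, possibly nested.
    # Layer 0 holds entities containing no other entity (innermost),
    # recognized first; each later layer holds entities that only
    # enclose already-recognized ones.
    def contains(outer, inner):
        return (outer[0] <= inner[0] <= inner[1] <= outer[1]
                and outer[:2] != inner[:2])
    remaining, layers = list(spans), []
    while remaining:
        inner = [s for s in remaining
                 if not any(contains(s, t) for t in remaining)]
        layers.append(inner)
        remaining = [s for s in remaining if s not in inner]
    return layers
```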

  • Shiki Sato, Reina Akama, Hiroki Ouchi, Jun Suzuki, Kentaro Inui
    2022 Volume 29 Issue 1 Pages 53-83
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    An automatic evaluation framework for open-domain dialogue response generation systems that can validate the effects of daily system improvements at low cost is needed. However, existing metrics commonly used for automatic response generation evaluation, such as bilingual evaluation understudy (BLEU), correlate poorly with human evaluation. This poor correlation arises from the nature of dialogue: an input context admits several acceptable responses. To address this issue, we focus on evaluating response generation systems via response selection, a task in which, for a given context, systems select an appropriate response from a set of response candidates. Because the systems can only select specific candidates, evaluation via response selection mitigates the effect of the above-mentioned nature of dialogue. Typically, false response candidates are randomly sampled from other, unrelated dialogues, which leads to two issues: (a) unrelated false candidates and (b) acceptable utterances marked as false. Response selection test sets built this way are unreliable. Thus, this paper proposes a method for constructing response selection test sets with well-chosen false candidates. Experiments demonstrate that evaluating systems via response selection with well-chosen false candidates correlates more strongly with human evaluation than commonly used automatic evaluation metrics such as BLEU.
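
    A toy sketch of the candidate-filtering idea, using simple lexical overlap as a stand-in for the paper's actual selection criteria (the function names and thresholds are illustrative):

```python
def jaccard(a, b):
    # Token-level Jaccard overlap between two utterances.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(1, len(ta | tb))

def choose_false_candidates(context, gold, pool, k=3,
                            min_related=0.1, max_acceptable=0.5):
    # Keep candidates that share vocabulary with the context (avoids
    # issue (a): unrelated false candidates) but that are not too close
    # to the gold response (avoids issue (b): acceptable utterances
    # marked as false).
    scored = [(jaccard(context, u), u) for u in pool
              if jaccard(gold, u) < max_acceptable
              and jaccard(context, u) >= min_related]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [u for _, u in scored[:k]]
```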

  • Junya Takayama, Tomoyuki Kajiwara, Yuki Arase
    2022 Volume 29 Issue 1 Pages 84-111
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    People often present their intentions indirectly in text. For example, if a person says to an operator of a reservation service, “I don’t have enough budget.”, it means “Please find a cheaper option for me.” While neural conversation models acquire the ability to generate fluent responses through training on a dialogue corpus, previous corpora have not focused on indirect responses. We create a large-scale dialogue corpus that provides pragmatic paraphrases to advance technology for understanding users’ underlying intentions. Our corpus provides a total of 71,498 indirect-direct utterance pairs accompanied by multi-turn dialogue histories extracted from the MultiWOZ dataset. In addition, we propose three tasks to benchmark the ability of models to recognize and generate indirect and direct utterances, and we investigate the performance of state-of-the-art pre-trained language models as baselines. We confirmed that the performance of dialogue response generation improves when indirect user utterances are converted into direct ones.
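
    As an illustration, one indirect-direct pair with its dialogue history (built from the abstract's own example) might be represented as follows; the field names are hypothetical, not the corpus's actual schema:

```python
# Hypothetical record shape for one indirect-direct utterance pair.
record = {
    "history": [
        "Operator: Hello, how can I help you?",
        "User: I'd like to book a hotel in the city centre.",
    ],
    "indirect": "I don't have enough budget.",
    "direct": "Please find a cheaper option for me.",
}
```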

  • Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki, Naoaki Okaza ...
    2022 Volume 29 Issue 1 Pages 112-143
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    We propose a novel method for finding an appropriate tokenization for a given downstream model by jointly optimizing the tokenizer and the model. The proposed method imposes no restriction other than using the loss values computed by the downstream model to train the tokenizer, and thus it can be applied to various NLP tasks. Moreover, it can explore an appropriate tokenization that improves the performance of an already trained model, as a post-processing step. The proposed method is therefore applicable to a wide range of situations. We evaluated whether our method improves performance on text classification in three languages and machine translation in eight language pairs. Experimental results show that our proposed method improves performance by determining appropriate tokenizations.
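
    The post-processing use case can be sketched by scoring candidate tokenizations of the same sentence with the downstream model's loss and keeping the best one (a toy stand-in: the actual method also trains the tokenizer with these loss values):

```python
def select_tokenization(candidates, downstream_loss):
    # The only signal the method needs is the loss the downstream model
    # assigns to each candidate tokenization, so selection also works as
    # post-processing for an already trained model.
    return min(candidates, key=downstream_loss)

# Toy usage: a hypothetical loss that simply prefers fewer tokens.
candidates = [["un", "related"], ["unrelated"], ["u", "n", "related"]]
best = select_tokenization(candidates, downstream_loss=len)
```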

  • Ryuji Kano, Tomoki Taniguchi, Tomoko Ohkuma
    2022 Volume 29 Issue 1 Pages 144-165
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    Previous research on summarization models regards titles as summaries of source texts; however, much research has reported that such training data are noisy. We propose an effective curriculum learning method for training summarization models on noisy data. Curriculum learning improves performance by sorting training data by difficulty or noisiness and is effective for training models on noisy data, but previous research has not applied it to summarization tasks. One aim of this research is to validate the effectiveness of curriculum learning for summarization tasks. In translation tasks, previous research quantified noise using two models trained on noisy and clean corpora; because such paired corpora do not exist in the summarization field, that method is difficult to apply to summarization tasks. Another aim of this research is therefore to propose a model that can quantify noise using a single noisy corpus. The training task of the proposed model, the Appropriateness Estimator, is to distinguish correct source-summary pairs from randomly assigned pairs; throughout training, the model learns to compute the appropriateness of source-summary pairs. We conduct experiments on three summarization models and verify that curriculum learning and our method improve performance.
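
    A minimal sketch of the two ingredients, assuming toy data (the function names are illustrative): negatives for the estimator come from randomly reassigning summaries within the same noisy corpus, and training pairs are then ordered by estimated appropriateness:

```python
import random

def estimator_training_data(pairs, seed=0):
    # pairs: list of (source, summary). Positives are the true pairs;
    # negatives reassign each source a summary drawn at random from the
    # same corpus, so no separate clean corpus is needed.
    rng = random.Random(seed)
    summaries = [summ for _, summ in pairs]
    data = [(src, summ, 1) for src, summ in pairs]
    for src, true_summ in pairs:
        wrong = rng.choice(summaries)
        while wrong == true_summ and len(set(summaries)) > 1:
            wrong = rng.choice(summaries)
        data.append((src, wrong, 0))
    return data

def curriculum_order(pairs, appropriateness):
    # Sort training pairs from cleanest to noisiest, i.e. by
    # descending appropriateness score.
    return sorted(pairs, key=appropriateness, reverse=True)
```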

  • Abdurrisyad Fikri, Hiep V. Le, Takashi Miyazaki, Manabu Okumura, Nobuy ...
    2022 Volume 29 Issue 1 Pages 166-186
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    Building good conversation agents is assumed to require an accurate conversation context. We argue that a conversation scene that includes the speakers can provide more information about the context, since using images as conversation contexts has proven effective. We constructed a visual conversation scene dataset (VCSD) that provides scene images corresponding to conversations. Each entry combines (1) a conversation scene image (third-person view), (2) the corresponding first utterance and its response, and (3) the corresponding speaker, respondent, and topic object. In our experiments on the response-selection task, we first examined BERT (text only) as a baseline. Although BERT performed well in general conversations, where a response continues from the previous utterance, it failed in cases where visual information was necessary to understand the context. Our error analysis found that conversations requiring visual contexts fall into three types: visual question answering, image-referring responses, and scene understanding. To exploit the conversation scene images and their focused parts, that is, the speaker, respondent, and topic object, we proposed a model that receives texts and multiple image features as inputs. Our model can capture this information and achieves 91% accuracy.
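
    A minimal sketch of the fusion step, assuming precomputed feature vectors (the function name, feature names, and dimensions are illustrative, not the paper's architecture):

```python
import numpy as np

def score_candidate(utterance_vec, scene_vec, speaker_vec,
                    respondent_vec, object_vec, w):
    # Fuse the text encoding with the scene image and its focused parts
    # (speaker, respondent, topic object) by concatenation, then score
    # one response candidate with a linear layer.
    x = np.concatenate([utterance_vec, scene_vec, speaker_vec,
                        respondent_vec, object_vec])
    return float(x @ w)
```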

  • Youmi Ma, Tatsuya Hiraoka, Naoaki Okazaki
    2022 Volume 29 Issue 1 Pages 187-223
    Published: 2022
    Released on J-STAGE: March 15, 2022
    JOURNAL FREE ACCESS

    In this study, we propose a method designed to extract named entities and relations from unstructured text based on table representations. To extract named entities, the proposed method computes representations for entity mentions and long-range dependencies using contextualized representations, without hand-crafted features or complex neural network architectures. To extract relations, it applies a tensor dot product to predict all relation labels simultaneously, without considering dependencies among relation labels. These advancements significantly simplify the proposed model and the associated algorithm for extracting named entities and relations. Despite its simplicity, the experimental results demonstrate that the proposed approach outperforms state-of-the-art methods on multiple datasets. Compared with existing table-filling approaches, the proposed method achieves high performance solely by predicting relation labels independently. In addition, we found that incorporating dependencies among relation labels into the system yielded little performance gain, indicating the effectiveness and sufficiency of the tensor dot-product mechanism for relation extraction in the proposed architecture. We also performed experimental analyses to explore the benefits of joint training with named entity recognition for relation extraction in our design, and we concluded that joint training with named entity recognition assists relation extraction by improving the span-level representations of entities.
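
    The label-independent scoring can be sketched with a bilinear tensor dot product over span representations (a simplified stand-in for the paper's exact parameterization):

```python
import numpy as np

def relation_scores(spans, W):
    # spans: (n, d) span representations from the entity table;
    # W: (L, d, d), one bilinear map per relation label. A single tensor
    # dot product scores every (head, tail, label) triple at once, with
    # no dependencies among relation labels.
    return np.einsum("id,ldk,jk->ijl", spans, W, spans)
```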

Society Column (Non Peer-Reviewed)
Information (Non Peer-Reviewed)