Journal of Natural Language Processing

Preface

[title in Japanese]

[in Japanese]

2021 Volume 28 Issue 3 Pages 743-744
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.743

JOURNAL FREE ACCESS

Obituary

[title in Japanese]

[in Japanese]

2021 Volume 28 Issue 3 Pages 745-746
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.745

JOURNAL FREE ACCESS

[title in Japanese]

[in Japanese]

2021 Volume 28 Issue 3 Pages 747-748
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.747

JOURNAL FREE ACCESS

[title in Japanese]

[in Japanese]

2021 Volume 28 Issue 3 Pages 749-750
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.749

JOURNAL FREE ACCESS

General Paper

Neural Text Generation with Artificial Negative Examples to Address Repeating and Dropping Errors

Keisuke Shirai, Kazuma Hashimoto, Akiko Eriguchi, Takashi Ninomiya, Sh ...

2021 Volume 28 Issue 3 Pages 751-777
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.751

JOURNAL FREE ACCESS

Neural text generation models that are conditioned on a given input (e.g., machine translation and image captioning) are typically trained through maximum likelihood estimation of the target text. However, models trained in this manner often suffer from various types of errors at inference time. In this study, we propose suppressing an arbitrary type of error by training the text generation model in a reinforcement learning framework, using a trainable reward function that can discriminate between references and sentences containing the targeted type of error. We create such negative examples by artificially injecting the targeted errors into the references. In the experiments, we focus on two error types: repeated and dropped tokens in model-generated text. The experimental results demonstrate that our method can suppress generation errors and achieves significant improvements on two machine translation and two image captioning tasks.

Length-constrained Neural Machine Translation using Length Prediction and Perturbation into Length-aware Positional Encoding

Yui Oka, Katsuhito Sudoh, Satoshi Nakamura

2021 Volume 28 Issue 3 Pages 778-801
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.778

JOURNAL FREE ACCESS

Neural machine translation often suffers from an under-translation problem owing to its limited modeling of the output sequence lengths. In this study, we propose a novel approach to training a Transformer model using length constraints based on length-aware positional encoding (PE). Because length constraints with exact target sentence lengths degrade the translation performance, we add a random perturbation, drawn from a uniform distribution within a certain range, to the length constraints in the PE during training. In the inference step, we predict the output lengths from the input sequences using a length prediction model based on a large-scale pre-trained language model. In Japanese-to-English and English-to-Japanese translation, experimental results show that the proposed perturbation injection improves robustness to length prediction errors, particularly within a certain range.

Improved Method for Organizing Information Contained in Multiple Documents into a Table

Masaki Murata, Kensuke Okazaki, Qing Ma

2021 Volume 28 Issue 3 Pages 802-823
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.802

JOURNAL FREE ACCESS

Okazaki et al. (2018) proposed a method for organizing the information contained in multiple documents into a table without limiting the information to be extracted. In this study, we propose a method for improving the accuracy of these tables. In our proposed method, information is first clustered hierarchically. Next, for the results of hierarchical clustering (with the number of clusters ranging from 1 to n), the degree of filling and the information density of the resulting table are calculated. The number of clusters at which these two indicators are best balanced is chosen as the optimal number of clusters. The results of the method using the chosen number of clusters are organized into a table. In the conventional method, the number of clusters estimated by the X-means method tends to be too small. As demonstrated by the results of experiments using 15 types of multiple documents, the proposed method mitigates this problem, with its estimated number of clusters being closer to the optimum. The average F-measure of the tables was 0.43 with the conventional method; the proposed method improves this to 0.65. We therefore confirm the effectiveness of the proposed method.

Automatic Speech Recognition for the Archive of Ainu Folklores

Kohei Matsuura, Masato Mimura, Tatsuya Kawahara

2021 Volume 28 Issue 3 Pages 824-846
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.824

JOURNAL FREE ACCESS

In this article, we describe our work on the speech recognition of Ainu folklores (Uwepeker). First, we constructed an Ainu speech corpus for the Saru dialect based on the data provided by two museums that had constructed the Ainu archive. Next, we built an automatic speech recognition (ASR) system based on an attention-based encoder-decoder model, and compared four recognition units: phones, syllables, word pieces, and words. With the syllable unit, we achieved phone recognition accuracies of 93.7% and 86.2% and word recognition accuracies of 78.3% and 61.4% for the speaker-closed and speaker-open conditions, respectively. To address the problem of significant degradation in the speaker-open condition, we propose an unsupervised speaker adaptation method using a CycleGAN. In this method, a CycleGAN learns a mapping from the speakers’ voices in the training data to the target speaker’s voice, and then converts all speech in the training data into the target speaker’s speech. This method reduced the phone error rate by up to 60.6%. In addition, we investigated language identification in Japanese and Ainu mixed speech and realized reasonable performance by cascading phone and word recognition modules.

Japanese Chess Commentary Corpus with Named Entity and Modality Annotation

Hirotaka Kameko, Suguru Matsuyoshi, John Richardson, Atsushi Ushiku, T ...

2021 Volume 28 Issue 3 Pages 847-873
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.847

JOURNAL FREE ACCESS

In recent years, there has been a surge of interest in natural language processing related to the real world, such as symbol grounding, language generation, and nonlinguistic data search using natural language queries. We argue that shogi (Japanese chess) commentaries, which are accompanied by game states, are an interesting testbed for these tasks. A commentator refers not only to the current board state but also to past and future moves, and yet such references can be grounded in the game tree, possibly with the help of modern game-tree search algorithms. In this paper, we build a shogi commentary corpus and augment it with manual annotation of word segmentation, named entities, modality expressions, and event factuality. This corpus can be used to train a computer to identify words and phrases that signal factuality and to determine events with that factuality, paving the way for grounding possible and counterfactual states.

Society Column

Towards AI Systems That Can Explain with Language

Kentaro Inui, Daisuke Bekki, Sadao Kurohashi, Minao Kukita

2021 Volume 28 Issue 3 Pages 874-880
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.874

JOURNAL FREE ACCESS

When Creative AI Meets Conversational AI

Xianchao Wu

2021 Volume 28 Issue 3 Pages 881-887
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.881

JOURNAL FREE ACCESS

Live Competition: AI King—Quiz AI Japan Championship—

Jun Suzuki, Koji Matshuda, Masatoshi Suzuki, Takuma Kato, Shumpei Miya ...

2021 Volume 28 Issue 3 Pages 888-894
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.888

JOURNAL FREE ACCESS

NLP2021 Workshop: Evaluation and Quality Estimation of Text—How Do We Judge Human- and Machine-generated Text Good or Bad?

Katsuhito Sudoh, Mamoru Komachi, Tomoyuki Kajiwara

2021 Volume 28 Issue 3 Pages 895-900
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.895

JOURNAL FREE ACCESS

Key Lessons from Workshop on New Normal Communication for Young Researchers

Naoya Inoue

2021 Volume 28 Issue 3 Pages 901-906
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.901

JOURNAL FREE ACCESS

WRIME: A Japanese Dataset for Emotional Intensity Estimation with Subjective and Objective Annotations

Tomoyuki Kajiwara

2021 Volume 28 Issue 3 Pages 907-912
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.907

JOURNAL FREE ACCESS

Contextualized and Generalized Sentence Representations by Contrastive Self-Supervised Learning: A Case Study on Discourse Relation Analysis

Hirokazu Kiyomaru

2021 Volume 28 Issue 3 Pages 913-917
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.913

JOURNAL FREE ACCESS

Exploring Advantages of Existing Methods through Various Experiments—The Quest of Efficient Perturbations on Sequence-to-Sequence Problems—

Sho Takase

2021 Volume 28 Issue 3 Pages 918-923
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.918

JOURNAL FREE ACCESS

End-to-end ASR to jointly predict transcriptions and linguistic annotations

Motoi Omachi

2021 Volume 28 Issue 3 Pages 924-929
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.924

JOURNAL FREE ACCESS

Information

[title in Japanese]

2021 Volume 28 Issue 3 Pages 930-935
Published: 2021
Released on J-STAGE: September 15, 2021

DOI: https://doi.org/10.5715/jnlp.28.930

JOURNAL FREE ACCESS
