Journal of Natural Language Processing

Preface (Non Peer-Reviewed)

[title in Japanese]

[in Japanese]

2025 Volume 32 Issue 2 Pages 402-403
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.402

JOURNAL FREE ACCESS

General Paper (Peer-Reviewed)

End-to-end Simultaneous Speech Translation with Style Tags using Human Simultaneous Interpretation Data

Yuka Ko, Ryo Fukuda, Yuta Nishikawa, Yasumasa Kano, Katsuhito Sudoh, S ...

2025 Volume 32 Issue 2 Pages 404-437
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.404

JOURNAL FREE ACCESS

Simultaneous speech translation (SimulST) translates speech incrementally, requiring a monotonic input-output correspondence to reduce latency. This is particularly challenging for distant language pairs, such as English and Japanese, as most SimulST models are trained using offline speech translation (ST) data, where the entire speech input is observed during translation. In simultaneous interpretation (SI), a simultaneous interpreter translates source language speech into target language speech without waiting for the speaker to finish speaking. Therefore, the SimulST model can learn SI-style translations using SI data. However, owing to the limited availability of SI data, fine-tuning an offline ST model using SI data may result in overfitting. To address this problem, we propose an efficient training method for the speech-to-text SimulST model using a combination of small SI and relatively large offline ST data. We trained a single model with mixed data by incorporating style tags to instruct the model to generate either SI or offline-style outputs. This approach, called mixed fine-tuning with style tags, can be extended further using the multistage self-training approach. In this case, we use the trained model to generate pseudo-SI data. Our experimental results for several test sets demonstrated that our models trained using mixed fine-tuning and multistage self-training outperformed baselines across various latency ranges.

Enhancing Automated Essay Scoring with Grammatical Features using Multi-task Learning and Item Response Theory

Kosuke Doi, Katsuhito Sudoh, Satoshi Nakamura, Taro Watanabe

2025 Volume 32 Issue 2 Pages 438-479
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.438

JOURNAL FREE ACCESS

In foreign language learning, writing tasks play a crucial role in developing and assessing learners’ language abilities, but manual scoring requires significant time and effort. Automated essay scoring (AES) is a way to mitigate this problem. Although human raters consider grammatical items and their difficulties as clues for judging learners’ proficiency levels while scoring essays, it is unclear whether the current state-of-the-art AES models, which use BERT-based essay representations, consider these factors. In this paper, we propose to incorporate grammatical features into BERT-based AES models in three ways: (1) using grammatical features as additional model inputs, (2) performing multi-task learning (MTL) with holistic and grammar scores while using grammatical features as model inputs, and (3) reconstructing grammatical features through MTL with holistic scores. For grammatical features, we model learners’ grammar usage using item response theory (IRT), which measures learners’ grammar abilities and characteristics of grammatical items, including their difficulties, based on essay data without teacher labels. The experimental results show that grammatical features improve the scoring performance, and further improvements are brought by MTL with holistic and grammar scores. We also show that weighting grammatical items using IRT-estimated difficulties improves the scoring performance, and that IRT-estimated grammar abilities can be used as labels for MTL.

Likelihood-based Mitigation of Evaluation Bias in Large Language Models

Masanari Ohi, Masahiro Kaneko, Ryuto Koike, Mengsay Loem, Naoaki Okaza ...

2025 Volume 32 Issue 2 Pages 480-496
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.480

JOURNAL FREE ACCESS

Large Language Models (LLMs) are widely used to evaluate natural language generation tasks as automated metrics. However, the likelihood, a measure of an LLM’s plausibility for a sentence, can vary due to superficial differences in sentences, such as word order and sentence structure. A likelihood bias may therefore arise when LLMs are used for evaluation: they might overrate sentences with higher likelihoods while underrating those with lower likelihoods. In this paper, we investigate the presence and impact of likelihood bias in LLM-based evaluators. We also propose a method to mitigate the likelihood bias. Our method utilizes highly biased instances as few-shot examples for in-context learning. Our experiments in evaluating the data-to-text and grammatical error correction tasks reveal that several LLMs we test display a likelihood bias. Furthermore, our proposed method successfully mitigates this bias and also significantly improves evaluation performance (in terms of correlation of models with human scores).

A Benchmark Suite of Japanese Natural Questions

Takuya Uematsu, Hao Wang, So Fukuda, Daisuke Kawahara, Tomohide Shibat ...

2025 Volume 32 Issue 2 Pages 497-519
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.497

JOURNAL FREE ACCESS

To develop high-performance and robust natural language processing (NLP) models, it is important to have various question answering (QA) datasets to train, evaluate, and analyze them. Although there are various QA datasets available in English, there are only a few QA datasets in other languages. We focus on Japanese, a language with only a few basic QA datasets, and aim to build a Japanese version of Natural Questions (NQ), JNQ, consisting of questions that naturally arise from human information needs. We collect natural questions from query logs of a Japanese search engine and build the dataset using crowdsourcing. Furthermore, we construct a Japanese version of BoolQ, JBoolQ, which is derived from NQ and consists of yes/no questions. We also re-define the dataset specification of the original NQ/BoolQ to construct JNQ/JBoolQ. JNQ consists of 16,641 questions, and JBoolQ consists of 6,467 questions. We also define three tasks from JNQ and one from JBoolQ and establish baselines using competitive methods drawn from related literature. We hope that these datasets will facilitate research on QA and NLP models in Japanese. We will make JNQ and JBoolQ publicly available.

Toward Enhancing Reasoning Capabilities of LLMs: An Approach via Synthetic Logic Corpus

Terufumi Morishita, Gaku Morio, Atsuki Yamaguchi, Yasuhiro Sogawa

2025 Volume 32 Issue 2 Pages 520-571
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.520

JOURNAL FREE ACCESS

Large language models (LLMs) are capable of solving a wide range of tasks, yet they have struggled with reasoning. To address this, we propose Additional Logic Training (ALT), which aims to enhance LLMs’ reasoning capabilities using program-generated logical reasoning samples. We first establish principles for designing high-quality samples by integrating symbolic logic theory and previous empirical insights. Then, based on these principles, we construct a synthetic corpus named Formal Logic Deduction Diverse (FLD×2). Finally, we empirically show that ALT on FLD×2 substantially enhances the reasoning capabilities of state-of-the-art LLMs, including LLaMA-3.1-70B. Improvements include gains of up to 30 points on logical reasoning benchmarks, up to 10 points on math and coding benchmarks, and 5 points on the benchmark suite BBH.

Data-to-Text Generation for Esports Game Commentary of Multiplayer Strategy Game

Zihan Wang, Naoki Yoshinaga

2025 Volume 32 Issue 2 Pages 572-597
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.572

JOURNAL FREE ACCESS

Esports, a form of sports competition played on video games, has become one of the most important sporting events. Despite the large accumulation of esports play logs, only a small portion are accompanied by text commentaries that help the audience retrieve and understand the plays. In this study, we introduce the task of generating commentaries from the data records of esports games. We begin by building large-scale esports data-to-text datasets that pair structured data records with textual commentaries from a popular esports game, League of Legends. We then explore several generation models to produce game commentaries from structured data records while also examining the impact of pre-trained language models. To assess the generated commentaries, we designed evaluation metrics that focus on the unique characteristics of esports data, such as strategic depth. The experimental results of the data-to-text generation using our dataset revealed the remaining challenges of this novel task.

Using Linguistic Formalism to Improve Real World Understanding for V&L Models: Case Study on Image Discrimination for Structurally Ambiguous Language Input

Lee Sangmyeong, Seitaro Shinagawa, Koichiro Yoshino, Satoshi Nakamura

2025 Volume 32 Issue 2 Pages 598-632
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.598

JOURNAL FREE ACCESS

In the context of Real World Understanding (RWU) for vision and language (V&L) models, accurately aligning language with the corresponding visual scene is critical. Since current models typically assume language inputs to be plain text, RWU faces potential issues with structural ambiguity, where a single sentence can have multiple meanings due to various phrase structures. This paper proposes to use linguistic formalism as input, which enriches language information and addresses the issue of structural ambiguity. We focus on the Contrastive Language-Image Pre-training (CLIP) model, a prominent V&L model, and on the image discrimination tasks of RWU. Our experiments test various approaches to incorporating formalism into the CLIP model, depending on the type of formalism and its processing method. We aim to determine the effectiveness of formalism in discriminating ambiguous images and identify which formalism works best. Additionally, we employ a gradient-based method to gain insights into how formalism is interpreted within the model’s architecture.

Implicit Sense-labeled Connective Recognition

Yui Oka, Daiki Yanamoto, Tsutomu Hirao, Kyosuke Nishida

2025 Volume 32 Issue 2 Pages 633-659
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.633

JOURNAL FREE ACCESS

Implicit Discourse Relation Recognition (IDRR) involves identifying the sense label of an implicit connective between adjacent text spans. This has traditionally been approached as a classification task. However, sense labels cannot exhaustively represent all discourse. This paper presents Implicit Sense-labeled Connective Recognition (ISCR), which identifies the implicit connectives as well as their sense labels between adjacent text spans. ISCR can be treated as a classification task, but this is difficult in practice due to the large number of potential categories, the use of sense labels, and the uneven distribution of instances among them. Accordingly, this paper instead handles ISCR as a text-generation task, using an encoder-decoder model to generate both connectives and their sense labels. Our evaluation results show that our generation-based method outperforms the conventional classification-based method.

Collection of Referring Expressions for Location and Route Information Using Maps as Stimuli

Mai Omura, Yoshiko Kawabata, Hikari Konishi, Masayuki Asahara, Johane ...

2025 Volume 32 Issue 2 Pages 660-678
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.660

JOURNAL FREE ACCESS

In this study, we constructed a database of expressions referring to both location and route information through crowdsourcing, and made it publicly available as open data. Twenty maps were used as stimuli, with 40 participants per map asked to describe the location of a target point, resulting in 800 referring expressions. For route information, two routes were defined on each map, and 40 participants per route were asked to describe the route between two points, yielding 1,600 referring expressions. Each expression was evaluated to determine whether it constituted a relative reference based on landmarks on the map. Location-referring expressions were categorized into four types: first-person perspective, within-space perspective, within-space movement, and bird’s-eye view. Route-referring expressions were labeled according to the presence of information about the starting point, waypoints, and the endpoint. Additionally, a survey was conducted to assess the comprehensibility of each expression, and the resulting data were collected accordingly.


Society Column (Non Peer-Reviewed)

Achievements and Challenges in Japanese Question Answering: Insights from Quiz Competition Results

Tomoki Ariyama, Masatoshi Suzuki

2025 Volume 32 Issue 2 Pages 679-683
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.679

JOURNAL FREE ACCESS

Research Background on Second Language Acquisition in Neural Language Models

Miyu Oba

2025 Volume 32 Issue 2 Pages 684-690
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.684

JOURNAL FREE ACCESS

Investigation of the Inference Capabilities and Memorization of Pre-trained Language Models

Yusuke Sakai

2025 Volume 32 Issue 2 Pages 691-698
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.691

JOURNAL FREE ACCESS

Theme Session in NLP2025 “Natural Language Processing for the Financial and Economic Domain”

Kei Nakagawa

2025 Volume 32 Issue 2 Pages 699-703
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.699

JOURNAL FREE ACCESS

Thematic Session 2: Dialogue Systems and Language Use for Human-AI Symbiosis

Mayumi Usami, Tetsuro Takahashi, Hiroyuki Nishikawa, Ryuichiro Higashi ...

2025 Volume 32 Issue 2 Pages 704-712
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.704

JOURNAL FREE ACCESS

Cognition, Brain, and Natural Language Processing

Satoshi Nishida, Ichiro Kobayashi, Yohei Oseki, Shohei Hidaka, Hitomi ...

2025 Volume 32 Issue 2 Pages 713-719
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.713

JOURNAL FREE ACCESS

NLP2025 Theme Session “AIWolf: Conversation Game of Liar Detection and Persuasion with LLM”

Yoshinobu Kano, Fujio Toriumi, Michimasa Inaba, Hirotaka Osawa, Daisuk ...

2025 Volume 32 Issue 2 Pages 720-726
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.720

JOURNAL FREE ACCESS

2nd Thematic Session on the Emergence of Language and Communication

Ryo Ueda

2025 Volume 32 Issue 2 Pages 727-732
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.727

JOURNAL FREE ACCESS

The Present and Future of Humanities and Language Processing

Hisako Usui, Hiroki Ouchi, Yuzuki Tsukagoshi, So Miyagawa

2025 Volume 32 Issue 2 Pages 733-737
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.733

JOURNAL FREE ACCESS

NLP2025 Workshop: Present and Future of Natural Language Evaluation in the LLM Era

Katsuhito Sudoh, Mamoru Komachi, Tomoyuki Kajiwara, Masato Mita

2025 Volume 32 Issue 2 Pages 738-745
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.738

JOURNAL FREE ACCESS

NLP2025 Workshop: Fine-Tuning and Evaluation for Large Language Models

Tsuyoshi Okita, Satoru Katsumata, Keisuke Kamata, Hirokazu Kiyomaru, ...

2025 Volume 32 Issue 2 Pages 746-750
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.746

JOURNAL FREE ACCESS

NLP2025 Workshop on Japanese Language Resources (JLR2025)

Masayuki Asahara, Takahiko Ito, Mai Omura, Daisuke Kawahara, Takahiro ...

2025 Volume 32 Issue 2 Pages 751-754
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.751

JOURNAL FREE ACCESS

Supporting Member Column (Non Peer-Reviewed)

Dataflow Architecture Unlocking Next-Generation Natural Language Processing

Masahiko Nakano

2025 Volume 32 Issue 2 Pages 755-760
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.755

JOURNAL FREE ACCESS

Information (Non Peer-Reviewed)

[title in Japanese]

2025 Volume 32 Issue 2 Pages 761-764
Published: 2025
Released on J-STAGE: June 15, 2025

DOI: https://doi.org/10.5715/jnlp.32.761

JOURNAL FREE ACCESS
