Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Current issue
Displaying 1-20 of 20 articles from this issue
Preface (Non Peer-Reviewed)
General Paper (Peer-Reviewed)
  • Shun Inadumi, Seiya Kawano, Akishige Yuguchi, Yasutomo Kawanishi, Koic ...
    2025 Volume 32 Issue 1 Pages 3-35
    Published: 2025
    Released on J-STAGE: March 15, 2025
    JOURNAL FREE ACCESS

    Situated conversations that refer to visual information, as in visual question answering (VQA), often contain ambiguities caused by reliance on directives. Word ellipsis exacerbates this problem in languages such as Japanese, where central elements of a question, such as its subject or object, can be omitted. Contextual cues in conversational situations, such as joint attention between a user and a system or the user's gaze, can resolve such ambiguities. In this study, we propose a gaze-grounded VQA dataset (LookVQA) that pairs questions containing directives or ellipses with gaze information, focusing on a clarification process complemented by gaze. We also propose a question-answering model that uses gaze-target estimation results to improve accuracy on the LookVQA task. Our experimental results show that our model improves performance on several question types in LookVQA and outperforms a VQA baseline. (An illustrative sketch of the gaze-conditioning idea follows this entry.)

    Download PDF (2426K)
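
As a rough illustration of the gaze-conditioning idea in the abstract above, the following Python sketch crops the estimated gaze-target region and hands it to a VQA model together with the question. This is not the authors' code; `vqa_model` and the gaze box are hypothetical stand-ins for their gaze-target estimator and question-answering model.

```python
# A minimal sketch (not the authors' code) of how a gaze-target estimate
# might be fed to a VQA model to disambiguate an elliptical question.
from PIL import Image

def answer_with_gaze(image: Image.Image, question: str,
                     gaze_box: tuple[int, int, int, int],
                     vqa_model) -> str:
    """Crop the estimated gaze-target region and let the (hypothetical)
    VQA model answer over it, falling back to the full image."""
    target_crop = image.crop(gaze_box)  # region the user is looking at
    answer = vqa_model(image=target_crop, question=question)
    return answer or vqa_model(image=image, question=question)
```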
  • Kotaro Aono, Ryohei Sasano, Koichi Takeda
    2025 Volume 32 Issue 1 Pages 36-54
    Published: 2025
    Released on J-STAGE: March 15, 2025
    JOURNAL FREE ACCESS

    There are several linguistic claims about the situations in which words are more likely to be used metaphorically. However, few studies have sought to verify such claims against large corpora. This study conducts a large-scale, corpus-based analysis of claims about metaphors by applying automatic metaphor detection to sentences extracted from Common Crawl and using the statistics obtained from the results. Specifically, we verified a total of five claims: three concerning the direct objects of verbs used as metaphors, and two concerning emotional polarity and subjectivity. The verification results support all five claims, indicating that the direct objects of verbs used as metaphors tend to have lower degrees of concreteness, imageability, and familiarity, and that metaphors are more likely to be used in emotional and subjective sentences. (A toy version of this comparison is sketched below.)

    Download PDF (378K)
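
The comparison described above can be pictured with a toy Python sketch: given verb/direct-object pairs flagged by a metaphor detector, compare the mean concreteness of objects in metaphorical versus literal uses. The norms and detector outputs below are invented placeholders, not the paper's data; real studies use published concreteness rating norms.

```python
# A toy sketch (assumptions, not the paper's pipeline): compare mean
# concreteness norms of direct objects in metaphorical vs. literal verb uses.
from statistics import mean

# Hypothetical psycholinguistic norms (placeholders for real rating data).
CONCRETENESS = {"idea": 1.6, "hope": 1.4, "table": 5.0, "door": 4.9}

samples = [  # (direct object, detector says the verb is metaphorical?)
    ("idea", True), ("hope", True), ("table", False), ("door", False),
]

metaphor = [CONCRETENESS[o] for o, m in samples if m and o in CONCRETENESS]
literal = [CONCRETENESS[o] for o, m in samples if not m and o in CONCRETENESS]
print(f"mean concreteness: metaphorical={mean(metaphor):.2f}, "
      f"literal={mean(literal):.2f}")  # the claim predicts metaphorical < literal
```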
  • Mai Omura, Aya Wakasa, Hiroshi Matsuda, Masayuki Asahara
    2025 Volume 32 Issue 1 Pages 55-90
    Published: 2025
    Released on J-STAGE: March 15, 2025
    JOURNAL FREE ACCESS

    In this study, we report the development and construction of UD_Japanese-CEJC, a Universal Dependencies (UD) treebank of spoken Japanese obtained by converting the Corpus of Everyday Japanese Conversation (CEJC) into the UD format. The CEJC is a large-scale spoken language corpus covering a variety of everyday Japanese conversations, annotated with word boundaries and morphological information. For UD_Japanese-CEJC, we additionally annotated the CEJC with long-unit-word morphological information and bunsetsu-phrase dependencies. The treebank was constructed by applying manually refined conversion rules to the CEJC, using its morphological information and bunsetsu-based syntactic dependencies. We examined various issues in constructing UD annotations for the CEJC by comparing it with a written Japanese corpus and by evaluating UD parsing accuracy. (A simplified sketch of this kind of conversion follows this entry.)

    Download PDF (1601K)
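
The following Python sketch shows, in highly simplified form, the kind of rule-based conversion such a treebank project involves: mapping bunsetsu-level dependencies to word-level UD heads by choosing a head word inside each bunsetsu. The head rule used here (last content word) is a common heuristic and an assumption, not the actual UD_Japanese-CEJC rule set.

```python
# A simplified sketch of bunsetsu-to-UD conversion (illustrative only).

def bunsetsu_to_ud(bunsetsu_words, bunsetsu_heads, is_content):
    """bunsetsu_words: list of lists of word ids per bunsetsu, in order;
    bunsetsu_heads: head bunsetsu index per bunsetsu (-1 for root);
    is_content: word id -> True if content word.
    Returns word id -> head word id (0 for root)."""
    # Heuristic: the head word of a bunsetsu is its last content word.
    head_word = {}
    for b, words in enumerate(bunsetsu_words):
        content = [w for w in words if is_content[w]]
        head_word[b] = (content or words)[-1]

    heads = {}
    for b, words in enumerate(bunsetsu_words):
        hb = bunsetsu_heads[b]
        for w in words:
            if w == head_word[b]:
                heads[w] = 0 if hb == -1 else head_word[hb]
            else:
                heads[w] = head_word[b]  # function words attach locally
    return heads
```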
  • Hiyori Yoshikawa, Naoaki Okazaki
    2025 Volume 32 Issue 1 Pages 91-113
    Published: 2025
    Released on J-STAGE: March 15, 2025
    JOURNAL FREE ACCESS

    The recent progress in the performance of large language models has made it necessary to detect errors in their generated content. One approach to detecting errors in language model generation is to estimate the confidence of the generated content based on information available at generation time. Existing methods mainly use model outputs and internal states; the setting in which the training data of the language model is also accessible has not been fully explored. This study examines the usefulness of training data for estimating the confidence of the output of trained language models. We trained a medium-scale language model, built a datastore consisting of the full text of its training data, and examined and evaluated several confidence estimation methods based on the training data. Experimental results on a language model knowledge evaluation task confirmed that combining predictive likelihood with information on relevant cases in the training data improves the accuracy of confidence estimation compared with not using the training data. (One way of combining these signals is sketched below.)

    Download PDF (896K)
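
One plausible way to combine the two signals mentioned above, the model's own likelihood and evidence retrieved from a training-data datastore, is sketched below. The interpolation scheme and the function name are assumptions for illustration; the paper's exact combination may differ.

```python
# A hedged sketch of combining predictive likelihood with training-data
# retrieval for confidence estimation. Assumes precomputed unit-norm
# embeddings of the training texts.
import numpy as np

def confidence(log_likelihood: float,
               query_emb: np.ndarray,   # embedding of the model's output
               datastore: np.ndarray,   # (N, d) training-text embeddings
               k: int = 8,
               alpha: float = 0.5) -> float:
    """Interpolate the model's likelihood with the mean cosine similarity
    of the k nearest training examples."""
    sims = datastore @ query_emb          # cosine sim for unit-norm vectors
    topk = np.sort(sims)[-k:]
    retrieval_score = float(topk.mean())  # high if supported by training data
    likelihood = float(np.exp(log_likelihood))
    return alpha * likelihood + (1 - alpha) * retrieval_score
```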
  • Ryuta Ishikawa, Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura
    2025 Volume 32 Issue 1 Pages 114-133
    Published: 2025
    Released on J-STAGE: March 15, 2025
    JOURNAL FREE ACCESS

    Although neural machine translation (NMT) usually produces high-quality translations thanks to flexible word choice and fluency, its quality can degrade for long input sentences. An existing divide-and-conquer approach to this problem splits a long input sentence into shorter segments and merges their translations, but it has brought only limited improvement in NMT. In this study, we propose a novel divide-and-conquer method for NMT that improves the translation of long sentences by using intra-sentence context. The proposed method (1) splits a sentence around coordinating conjunctions that connect clauses labeled S by syntactic parsing, (2) translates these clauses with a clause-level translation model that exploits intra-sentence context, and (3) merges the clause-level translations with another sequence-to-sequence model to obtain a sentence-level translation. In English-to-Japanese translation experiments on ASPEC using a pre-trained multilingual BART model, the proposed method outperformed a multilingual BART-based NMT baseline for input sentences with more than 40 words. (The three-step pipeline is sketched below.)

    Download PDF (1954K)
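
The three-step pipeline can be sketched as follows; `clause_translator` and `merger` are hypothetical stand-ins for the paper's clause-level mBART translator and its merging sequence-to-sequence model.

```python
# A minimal sketch of the divide-and-conquer pipeline described above.

def translate_long_sentence(clauses: list[str],
                            clause_translator,
                            merger) -> str:
    """clauses: S-labelled clauses obtained by splitting the source
    sentence around coordinating conjunctions (step 1)."""
    # Step 2: translate each clause, exposing the rest of the sentence
    # as intra-sentence context.
    translations = []
    for i, clause in enumerate(clauses):
        context = " ".join(clauses[:i] + clauses[i + 1:])
        translations.append(clause_translator(clause, context=context))
    # Step 3: merge clause translations into one sentence-level translation.
    return merger(" | ".join(translations))
```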
  • Tatsuya Aoki, Jey Han Lau, Hidetaka Kamigaito, Hiroya Takamura, Timoth ...
    2025 Volume 32 Issue 1 Pages 134-175
    Published: 2025
    Released on J-STAGE: March 15, 2025
    JOURNAL FREE ACCESS

    User-generated texts contain not only non-standard words, such as b4 for before, but also unusual usages of standard words, such as catfish for a person who uses a fake identity online; handling such cases in natural language processing requires knowledge about these words. We present a neural model for detecting non-standard usages in social media text. To deal with the lack of training data for this task, we propose a method for synthetically generating pseudo non-standard examples from a corpus, which enables us to train the model without manually annotated data and for any language. Experimental results on Twitter and Reddit datasets show that our proposed method achieves better performance than existing methods and is effective across different languages. (One simple instantiation of the pseudo-example generation is sketched below.)

    Download PDF (443K)
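
One simple instantiation of pseudo-example generation, shown for illustration only, is to corrupt a corpus sentence by swapping one word for an unrelated vocabulary word, so that a detector learns to flag tokens that do not fit their context. The paper's generation procedure may be more elaborate.

```python
# A sketch of synthetically generating pseudo non-standard usages from a
# corpus (one simple instantiation, not necessarily the paper's).
import random

def make_pseudo_example(tokens: list[str], vocab: list[str]):
    """Return (tokens, index, label=1) where the token at `index` was
    swapped for a random vocabulary word, simulating non-standard usage."""
    i = random.randrange(len(tokens))
    replacement = random.choice([w for w in vocab if w != tokens[i]])
    corrupted = tokens[:i] + [replacement] + tokens[i + 1:]
    return corrupted, i, 1  # positive (non-standard) training example

sent = "she posted the photo online".split()
print(make_pseudo_example(sent, ["catfish", "table", "run"]))
```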
  • Benjamin Clavié
    2025 Volume 32 Issue 1 Pages 176-218
    Published: 2025
    Released on J-STAGE: March 15, 2025
    JOURNAL FREE ACCESS

    Neural information retrieval has advanced rapidly in high-resource languages, but progress in lower-resource ones such as Japanese has been hindered by data scarcity, among other challenges. Consequently, multilingual models have dominated Japanese retrieval, despite their computational inefficiency and inability to capture linguistic nuances. While recent multi-vector monolingual models like JaColBERT have narrowed this gap, they still lag behind multilingual methods in large-scale evaluations. This work addresses the suboptimal training methods of multi-vector retrievers in lower-resource settings, focusing on Japanese. We systematically evaluate and improve key aspects of the inference and training settings of JaColBERT and, more broadly, multi-vector models. We further enhance performance through a novel checkpoint merging step, showing it to be an effective way of combining the benefits of fine-tuning with the generalization capabilities of the original checkpoint. Building on our analysis, we introduce a novel training recipe, resulting in the JaColBERTv2.5 model. JaColBERTv2.5, with only 110 million parameters and trained in under 15 hours on 4 A100 GPUs, significantly outperforms all existing methods across all common benchmarks, reaching an average score of 0.754, well above the previous best of 0.720. To support future research, we make our final models, intermediate checkpoints, and all data used publicly available. (A sketch of checkpoint merging as weight averaging follows this entry.)

    Download PDF (267K)
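
The checkpoint-merging step can be pictured as weight averaging between the pre-fine-tuning and fine-tuned checkpoints, as in the hedged sketch below; the paper's merging procedure may weight or select parameters differently.

```python
# A hedged sketch of checkpoint merging as simple weight averaging.
# Assumes both checkpoints are plain state_dicts with matching keys and
# floating-point tensors.
import torch

def merge_checkpoints(base_path: str, tuned_path: str,
                      out_path: str, weight: float = 0.5) -> None:
    base = torch.load(base_path, map_location="cpu")
    tuned = torch.load(tuned_path, map_location="cpu")
    merged = {k: (1 - weight) * base[k] + weight * tuned[k] for k in base}
    torch.save(merged, out_path)
```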
  • Jiannan Mao, Chenchen Ding, Hour Kaing, Hideki Tanaka, Masao Utiyama, ...
    2025 Volume 32 Issue 1 Pages 219-251
    Published: 2025
    Released on J-STAGE: March 15, 2025
    JOURNAL FREE ACCESS

    UDify (Kondratyuk and Straka 2019) is a multilingual, multi-task parser built by fine-tuning mBERT that achieves remarkable performance on high-resource languages. However, for some low-resource languages, its performance saturates early and then gradually degrades as training proceeds. To address this issue, this study applies a data augmentation method to improve parsing performance. We conducted experiments on five few-shot and three zero-shot languages to test the effectiveness of this approach. Unlabeled attachment scores improved on the zero-shot dependency parsing tasks, with the average score increasing from 55.6% to 59.0%, while dependency parsing in high-resource languages and other Universal Dependencies tasks was almost unaffected. The experimental results demonstrate that the data augmentation method is effective for low-resource languages in multilingual dependency parsing. Furthermore, our experiments confirm that continuously increasing the quantity of synthetic data enhances UDify's performance, particularly for zero-shot target languages. (A generic treebank augmentation of this kind is sketched below.)

    Download PDF (1129K)
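
For illustration, one common way to synthesize extra treebank data for low-resource dependency parsing is subtree "cropping", shown below. This is a generic augmentation sketch, not necessarily the method used in the paper.

```python
# One common treebank augmentation (illustrative only): "crop" a dependency
# tree down to the subtree of one token to create a new, shorter training
# sentence with a consistent tree.

def crop_subtree(tokens: list[str], heads: list[int], pivot: int):
    """tokens/heads: 1-indexed dependency tree (heads[i-1] is the head of
    token i, 0 = root). Returns the subtree rooted at `pivot`."""
    keep = {pivot}
    changed = True
    while changed:  # collect all descendants of the pivot
        changed = False
        for i, h in enumerate(heads, start=1):
            if h in keep and i not in keep:
                keep.add(i)
                changed = True
    order = sorted(keep)
    remap = {old: new for new, old in enumerate(order, start=1)}
    new_tokens = [tokens[i - 1] for i in order]
    new_heads = [0 if i == pivot else remap[heads[i - 1]] for i in order]
    return new_tokens, new_heads
```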
  • Aru Maekawa, Satoshi Kosugi, Kotaro Funakoshi, Manabu Okumura
    2025 Volume 32 Issue 1 Pages 252-282
    Published: 2025
    Released on J-STAGE: March 15, 2025
    JOURNAL FREE ACCESS

    Dataset distillation aims to compress a training dataset into a few informative synthetic samples such that neural networks trained on them perform as well as those trained on the original training dataset. Current text dataset distillation methods create each synthetic sample as a sequence of word embeddings instead of text data in order to apply gradient-based optimization; however, such embedding-level distilled datasets cannot be used to train other models whose word embedding weights differ from those of the model used for distillation. To address this issue, we propose a novel text dataset distillation approach, called distilling a dataset into a language model (DiLM), which trains a language model to generate informative synthetic training samples as text data, rather than directly optimizing the synthetic samples. We evaluated DiLM on various text classification datasets and showed that distilled synthetic datasets from DiLM outperform those from current coreset selection methods. DiLM achieved remarkable generalization performance in training different types of models and in the in-context learning of large language models. Our code is available at https://github.com/arumaekawa/DiLM. (A condensed sketch of the idea follows this entry.)

    Download PDF (627K)
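
A condensed sketch of the DiLM idea using Hugging Face Transformers follows. It is illustrative only: the actual method optimizes the generator with a gradient-matching objective rather than plain language-model fine-tuning, and the prompt format is an assumption.

```python
# A condensed, illustrative sketch of distilling a dataset into a language
# model: tune a generator on labeled text, then sample synthetic text data.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
gen = AutoModelForCausalLM.from_pretrained("gpt2")

# 1) Fine-tune `gen` on "label: ..., text: ..." lines from the original
#    dataset (training loop omitted; DiLM itself uses a gradient-matching
#    objective here, not plain language modeling).
# 2) Sample synthetic training texts from the tuned generator:
prompt = tok("label: positive, text:", return_tensors="pt")
out = gen.generate(**prompt, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tok.decode(out[0], skip_special_tokens=True))
# 3) Train any downstream classifier on the generated text; because the
#    samples are plain text, they transfer across word-embedding spaces.
```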
  • Aru Maekawa, Naoki Kobayashi, Kotaro Funakoshi, Manabu Okumura
    2025 Volume 32 Issue 1 Pages 283-299
    Published: 2025
    Released on J-STAGE: March 15, 2025
    JOURNAL FREE ACCESS

    Dataset distillation aims to create a small dataset of informative synthetic samples that rapidly trains neural networks while retaining the performance of the original dataset. In this study, we focus on constructing distilled few-shot datasets for natural language processing (NLP) tasks to fine-tune pre-trained transformers. Specifically, we propose introducing attention labels, which efficiently distill knowledge from the original dataset and transfer it to transformer models via attention probabilities. We evaluated our dataset distillation methods on four NLP tasks and demonstrated that distilled few-shot datasets with attention labels yield impressive performance when fine-tuning BERT. Specifically, on AGNews, a four-class news classification task, our distilled few-shot dataset achieved up to 93.2% accuracy, which is 98.5% of the accuracy obtained with the original dataset, even with only one sample per class and a single gradient step. (The attention-label loss term is sketched below.)

    Download PDF (157K)
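
The role of attention labels can be pictured as an extra supervision term in the fine-tuning loss, as in the sketch below; the tensor shapes and the weighting are illustrative assumptions rather than the paper's exact formulation.

```python
# A sketch of how attention labels can enter the fine-tuning loss: a
# distilled sample carries, besides its input and class label, target
# attention probabilities that the transformer is pushed toward.
import torch
import torch.nn.functional as F

def distill_step_loss(logits, label, attn_probs, attn_labels, lam=1.0):
    """logits: (num_classes,) prediction for one distilled sample;
    label: scalar LongTensor class index;
    attn_probs: (layers, heads, seq, seq) attention from the model;
    attn_labels: same shape, the learned attention labels."""
    task_loss = F.cross_entropy(logits.unsqueeze(0), label.unsqueeze(0))
    attn_loss = F.kl_div(attn_probs.clamp_min(1e-9).log(),
                         attn_labels, reduction="batchmean")
    return task_loss + lam * attn_loss
```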
  • Asahi Yoshida, Yoshihide Kato, Shigeki Matsubara
    2025 Volume 32 Issue 1 Pages 300-329
    Published: 2025
    Released on J-STAGE: March 15, 2025
    JOURNAL FREE ACCESS

    Negation scope resolution is the task of identifying the part of a sentence affected by a negation cue. The three major corpora used for this task (the BioScope corpus, the SFU review corpus, and the Sherlock dataset) annotate negation scope under different schemes. These differences make it difficult to use the three corpora together in studies of negation scope resolution. To address this issue by merging the corpora into a unified dataset under a common annotation scheme, we propose a method for automatically converting the scopes of BioScope and SFU to those of Sherlock. We evaluated the accuracy of our method on a dataset obtained by manually annotating negation scopes in a small portion of BioScope and SFU, verifying that it converts scopes with high accuracy. In addition, we verified the effectiveness of our method from a practical perspective by fine-tuning PLM-based negation scope resolution models on the unified dataset produced by our method. The results demonstrated that model performance increases when fine-tuned on the unified dataset but not on a simply combined one, which supports the effectiveness of our method. (A simplified conversion rule is sketched below.)

    Download PDF (388K)
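
A single, simplified conversion rule is sketched below: BioScope scopes start at the negation cue and include it, whereas Sherlock scopes exclude the cue but extend leftward over the subject of the negated clause. The paper uses a richer, syntax-aware rule set.

```python
# A simplified sketch of one scope-conversion rule (illustrative only).

def bioscope_to_sherlock(tokens, cue_idx, scope, subject_span):
    """tokens: sentence tokens; cue_idx: index of the negation cue;
    scope: set of indices in the BioScope scope; subject_span: (start, end)
    indices of the negated clause's subject, from a syntactic parse."""
    new_scope = set(scope)
    new_scope.discard(cue_idx)              # Sherlock excludes the cue...
    new_scope.update(range(*subject_span))  # ...but includes the subject
    return new_scope

tokens = "the drug did not reduce mortality".split()
print(sorted(bioscope_to_sherlock(tokens, 3, {3, 4, 5}, (0, 2))))
# -> [0, 1, 4, 5]
```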
Society Column (Non Peer-Reviewed)
Information (Non Peer-Reviewed)