Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper (Peer-Reviewed)
Jamp_sp: A Controlled Japanese Temporal Inference Dataset Considering Aspect
Tomoki Sugimoto, Yasumasa Onoe, Hitomi Yanaka
JOURNAL FREE ACCESS

2024 Volume 31 Issue 2 Pages 637-679

Abstract

Temporal inference, i.e., natural language inference involving time, is a challenging task because various time-related linguistic phenomena, such as tense and aspect, interact in complex ways. Although several temporal inference datasets have been proposed to assess the temporal inference ability of language models, they focus primarily on English and cover only a few linguistic phenomena. It therefore remains unclear whether Japanese language models can generalize across diverse temporal inference patterns. In this research, we constructed Jamp_sp, a controlled Japanese temporal inference dataset considering aspect, which includes a variety of temporal inference patterns. The training and test data in Jamp_sp can be split in a controlled manner based on problem attributes such as temporal inference patterns and time formats, which allows a detailed analysis of the generalization capacity of language models. To carry out this analysis, we trained language models on the training data before and after the split and evaluated them on our test data. The results demonstrate that Jamp_sp is challenging not only for discriminative language models but also for current generative language models, such as GPT-4, and that the generalization capacity of these models leaves room for improvement.
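
As an illustration of the attribute-based control of training and test data described in the abstract, the following is a minimal sketch and not the authors' implementation. The field names (`pattern`, `time_format`), the attribute values, and the toy examples are all hypothetical, chosen only to show how holding out one attribute value from training might be expressed.

```python
from typing import Dict, List, Set, Tuple

# Each example is a plain dict. The keys used here are hypothetical and do not
# reflect the actual Jamp_sp file format.
Example = Dict[str, str]


def split_by_attribute(
    examples: List[Example], attribute: str, held_out_values: Set[str]
) -> Tuple[List[Example], List[Example]]:
    """Return (train, test), where test holds every example whose
    `attribute` value is in `held_out_values` and train holds the rest."""
    train = [ex for ex in examples if ex[attribute] not in held_out_values]
    test = [ex for ex in examples if ex[attribute] in held_out_values]
    return train, test


# Toy placeholder data (not taken from the dataset).
toy_examples: List[Example] = [
    {"id": "1", "pattern": "pattern_a", "time_format": "format_x", "label": "entailment"},
    {"id": "2", "pattern": "pattern_a", "time_format": "format_y", "label": "contradiction"},
    {"id": "3", "pattern": "pattern_b", "time_format": "format_x", "label": "neutral"},
]

# Hold out one time format so that it appears only at test time.
train_set, test_set = split_by_attribute(toy_examples, "time_format", {"format_y"})
print([ex["id"] for ex in train_set], [ex["id"] for ex in test_set])
```

Training once on the full training data and once on `train_set`, then comparing accuracy on the held-out examples, is one way to probe whether a model generalizes to an attribute value it has not seen.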

© 2024 The Association for Natural Language Processing