Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
35th (2021)
Session ID : 4J3-GS-6f-02

JSICK: Japanese Sentences Involving Compositional Knowledge Dataset
*Hitomi YANAKA, Koji MINESHIMA

Abstract

This paper introduces JSICK, a Japanese dataset for Recognizing Textual Entailment (RTE) and Semantic Textual Similarity (STS). JSICK is a manual translation of the English dataset SICK, which focuses on compositional aspects of natural language inference. Each sentence in JSICK is annotated with semantic tags so that we can analyze whether models capture diverse semantic phenomena. We perform a baseline evaluation of BERT-based RTE and STS models on JSICK, as well as a stress test that scrambles word order in the JSICK test set. The results suggest that there is room to improve both performance on complex inferences and the generalization capacity of the models.
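
To make the word order scrambling stress test concrete, the following is a minimal sketch of one way such variants could be generated. It is not the procedure used in the paper, which the abstract does not describe. It assumes the sentence has already been segmented into phrases (the function scramble_phrases and the example sentence are hypothetical and not taken from JSICK) and relies only on the general fact that Japanese permits fairly free ordering of case-marked phrases before a sentence-final predicate.

```python
import itertools


def scramble_phrases(phrases):
    """Return every reordering of the non-final phrases.

    Assumption: `phrases` is a pre-segmented sentence whose last element
    is the predicate, which stays in sentence-final position.
    The original order itself is excluded from the result.
    """
    if len(phrases) < 2:
        return []
    *arguments, predicate = phrases
    variants = []
    for perm in itertools.permutations(arguments):
        if list(perm) != arguments:
            variants.append(list(perm) + [predicate])
    return variants


# Hypothetical example sentence (not from JSICK):
# "The boy is kicking a ball in the park."
phrases = ["男の子が", "公園で", "ボールを", "蹴っている"]
variants = scramble_phrases(phrases)
for variant in variants:
    print("".join(variant))
```

Under this sketch, each scrambled sentence would keep the gold label of the original pair, so a model that generalizes well should give the same prediction for the original and the scrambled version.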

© 2021 The Japanese Society for Artificial Intelligence