Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
36th (2022)
Session ID : 4Yin2-17

Gap between Semantic Textual Similarity Benchmark Task and Downstream Tasks
*Kaori ABE, Sho YOKOI, Tomoyuki KAJIWARA, Kentaro INUI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

The Semantic Textual Similarity (STS) task measures a system's ability to estimate the similarity between two sentences, an ability required by downstream tasks such as machine translation evaluation and related passage retrieval. Many NLP researchers discuss system performance on this ability using benchmark datasets. However, a system that scores highly on a benchmark dataset may not be equally effective in actual downstream tasks. In this study, we examined this gap between STS and downstream tasks, clarified which factors are important when evaluating the similarity between two sentences in downstream tasks, and discussed a policy for improving the benchmark dataset.

© 2022 The Japanese Society for Artificial Intelligence