Host: The Japanese Society for Artificial Intelligence
Name : The 35th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 35
Location : [in Japanese]
Date : June 08, 2021 - June 11, 2021
A broad range of applications in natural language processing and text mining, such as similarity-based text retrieval and automatic evaluation of generated texts, requires the computation of sentence similarity. However, studies of textual similarity have largely ignored multi-word expressions (MWEs), an important component of natural language. MWEs are phrases whose overall meaning cannot be naturally inferred from the meanings of their constituent words, such as “hot dog.” When computing the meaning of a whole sentence, processing the meaning of MWEs accurately is as important as processing the meaning of each individual word. To introduce the perspective of MWEs into the study of textual similarity, we attempt to create a new textual similarity dataset that requires semantic computation of MWEs. Specifically, we exploited (1) a combination of back-translation and constrained decoding and (2) mask prediction by BERT. We showed that our proposed method can produce balanced sentence similarity evaluation data.