Transactions of the Japanese Society for Artificial Intelligence
Online ISSN : 1346-8030
Print ISSN : 1346-0714
ISSN-L : 1346-0714
Original Paper
[title in Japanese]
Masatoshi Tsuchiya, Takuto Watarai
JOURNAL FREE ACCESS

2022 Volume 37 Issue 4 Pages A-LC3_1-12

Abstract

This paper describes a dataset of Japanese cloze questions designed for evaluating machine reading comprehension. The questions are generated automatically from Aozora Bunko, and each is defined as a 4-tuple: a context passage, a query containing a slot, an answer character, and a set of possible answer characters. The query is generated from the sentence that immediately follows the context passage in the target book by replacing the answer character with the slot. The set of possible answer characters consists of the answer character and the other characters who appear in the context passage. Because the context passage and the query share the same context, a machine that understands the context precisely should be able to select the correct answer from the set of possible answer characters. The distinctive feature of our approach is that the slots are the characters of the target books: characters play important roles in narrative texts, and reading comprehension requires a precise understanding of the relationships among them. To extract characters from the target books, we use manually created character dictionaries, because some characters are referred to by common nouns rather than by named entities.
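
As an illustration only, the 4-tuple and the generation step described in the abstract might be sketched in Python as follows. This is not the authors' implementation: the names `ClozeQuestion`, `make_question`, and `SLOT`, the plain substring matching, the longest-name-first choice of answer, and the rule of skipping sentences that leave no alternative answers are all assumptions made for the example, and English text stands in for the Japanese source.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional

# Hypothetical slot marker; the abstract does not say what token is used.
SLOT = "[SLOT]"


@dataclass(frozen=True)
class ClozeQuestion:
    """The 4-tuple from the abstract (field names are assumptions)."""
    context: str                # context passage
    query: str                  # following sentence with the answer replaced by SLOT
    answer: str                 # answer character
    candidates: FrozenSet[str]  # answer plus other characters seen in the context


def make_question(
    context: str,
    next_sentence: str,
    characters: Iterable[str],
) -> Optional[ClozeQuestion]:
    """Build one question from a passage and the sentence that follows it.

    `characters` plays the role of a manually created character dictionary
    for the book. Matching is naive substring search (an assumption).
    """
    # Try longer names first so a short name does not shadow a longer one.
    names = sorted(set(characters), key=lambda n: (-len(n), n))

    # Assumption: the first dictionary entry found in the sentence is the answer.
    answer = next((n for n in names if n in next_sentence), None)
    if answer is None:
        return None

    # Other characters that appear in the context passage.
    others = {n for n in names if n != answer and n in context}
    if not others:
        return None  # assumption: a question needs at least one alternative

    query = next_sentence.replace(answer, SLOT, 1)
    return ClozeQuestion(context, query, answer, frozenset(others | {answer}))


if __name__ == "__main__":
    question = make_question(
        context="The old man met Hanako at the gate. Taro watched them.",
        next_sentence="Hanako smiled at the old man.",
        characters=["Taro", "Hanako", "the old man"],
    )
    print(question)
```

The dictionary entry "the old man" in the usage example is a common noun phrase rather than a proper name, which is the kind of character a named-entity recognizer would likely miss and a hand-built dictionary can cover.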

© The Japanese Society for Artificial Intelligence 2022