Mathematical Linguistics
Online ISSN : 2433-0302
Print ISSN : 0453-4611
Resource
J-TOCC: Japanese Topic-Oriented Conversation Corpus
Naoki Nakamata, Yoko Ota, Eri Kato, Hiroko Sawada, Yukiko Shimizu, Atsushi Mori
JOURNAL OPEN ACCESS

2021 Volume 33 Issue 1 Pages 11-21

Abstract

We constructed the Japanese Topic-Oriented Conversation Corpus (J-TOCC) to study how topic influences vocabulary, grammar, and discourse strategies in conversation. The main feature of this corpus is that the topics are fixed. Pairs of university students were asked to converse on 15 topics, and each conversation was recorded for precisely 5 minutes, so conditions other than the topic are controlled. Eleven topics relate to daily life, and four relate to society. In total, 120 pairs participated, so 10 hours of conversation (120 pairs × 5 minutes) were recorded for each topic. J-TOCC contains about 1.6 million words in total. The pairs were balanced in terms of gender combination and recording site. In addition, speakers' degrees of familiarity with each topic were surveyed, and the survey data are included with the corpus.
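
The recording figures in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch that assumes every one of the 120 pairs recorded one 5-minute conversation on each of the 15 topics, as the abstract implies. The constant and function names are illustrative and are not part of the corpus, and the overall total is derived here rather than stated in the abstract.

```python
# Figures taken from the abstract; all names below are illustrative only.
PAIRS = 120                   # participating pairs
TOPICS = 15                   # fixed topics (11 daily life + 4 society)
MINUTES_PER_CONVERSATION = 5  # recording length per conversation


def hours_per_topic(pairs=PAIRS, minutes=MINUTES_PER_CONVERSATION):
    """Recorded hours for one topic, assuming every pair covers it once."""
    return pairs * minutes / 60


def total_hours(topics=TOPICS, pairs=PAIRS, minutes=MINUTES_PER_CONVERSATION):
    """Recorded hours across all topics (derived, not stated in the abstract)."""
    return topics * hours_per_topic(pairs, minutes)


print(hours_per_topic())  # 10.0, matching the abstract
print(total_hours())      # 150.0
```

Under that assumption, the per-topic figure reproduces the 10 hours reported in the abstract.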

© 2021 The Mathematical Linguistic Society of Japan

This article is licensed under a Creative Commons Attribution 4.0 International license.
https://creativecommons.org/licenses/by/4.0/deed.ja