2022 Volume 29 Issue 1 Pages 84-111
People often convey their intentions indirectly. For example, if a person tells an operator of a reservation service, “I don’t have enough budget,” they mean “Please find a cheaper option for me.” Although neural conversation models learn to generate fluent responses by training on dialogue corpora, previous corpora have not focused on indirect responses. We create a large-scale dialogue corpus that provides pragmatic paraphrases to advance technology for understanding users’ underlying intentions. Our corpus provides a total of 71,498 indirect-direct utterance pairs, each accompanied by a multi-turn dialogue history extracted from the MultiWoZ dataset. In addition, we propose three tasks to benchmark the ability of models to recognize and generate indirect and direct utterances, and we evaluate state-of-the-art pre-trained language models as baselines. We confirm that converting indirect user utterances into direct ones improves the performance of dialogue response generation.