Extractive and Abstractive Summarization for Multi-Sentence Questions

石垣 達也; 高村 大也; 奥村 学

doi:10.5715/jnlp.26.37

Abstract

Questions are asked in many situations, such as sessions at conferences and inquiries sent by email. In such situations, questions are often lengthy and hard to understand because they contain peripheral information in addition to their main focus. We therefore propose the task of question summarization. In this research, we first analyzed question-summary pairs extracted from a Community Question Answering (CQA) site and found that some questions cannot be summarized by extractive approaches and instead require abstractive ones. We created a dataset by treating the question-title pairs posted on a CQA site as question-summary pairs. Using this data, we trained extractive and abstractive summarization models and compared them by ROUGE score and manual evaluation. Our experimental results show that an abstractive method, the encoder-decoder with the copying mechanism, achieves better scores on both the ROUGE-2 F-measure and the evaluation by human judges.
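
For readers unfamiliar with the metric named above, the following is a minimal sketch of a ROUGE-2 F-measure: the harmonic mean of bigram precision and bigram recall between a candidate summary and a reference summary. The function names `bigrams` and `rouge2_f`, the whitespace tokenization, and the example sentences are illustrative assumptions and are not taken from the paper; the authors' actual evaluation setup (tokenizer, stemming, ROUGE toolkit options) is not described in this abstract.

```python
from collections import Counter


def bigrams(tokens):
    """Count the adjacent token pairs in a token list."""
    return Counter(zip(tokens, tokens[1:]))


def rouge2_f(candidate, reference):
    """ROUGE-2 F-measure between two token lists (illustrative sketch)."""
    cand = bigrams(candidate)
    ref = bigrams(reference)
    # Clipped overlap: each bigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example; whitespace tokenization is an assumption.
reference = "how do i reset my password".split()
candidate = "how do i reset it".split()
score = rouge2_f(candidate, reference)
```

Here the candidate shares three bigrams with the reference, so precision is 3/4, recall is 3/5, and the F-measure is 2/3.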

Licensed under CC BY 4.0
https://creativecommons.org/licenses/by/4.0/
