By using knowledge bases, question answering (QA) systems have become able to answer questions accurately over a wide variety of topics. However, knowledge bases are available in only a few major languages, so it is often necessary to build QA systems that answer questions in one language based on an information source in another (cross-lingual QA: CLQA). Machine translation (MT) is one way to achieve CLQA, and it is intuitively clear that a better MT system improves QA accuracy. However, it is not clear whether an MT system that is better for human consumption is also better for CLQA. In this paper, we investigate the relationship between manual and automatic translation evaluation metrics and CLQA accuracy: we create a data set containing both manual and machine translations, and perform CLQA on this data set. We find that QA accuracy is closely related to a metric that considers word frequency, and through manual analysis we identify two properties of translation results that affect CLQA accuracy: one is the mistranslation of content words, and the other is the lack of question-type words. In addition, we show that CLQA accuracy can be improved by using a metric that correlates highly with it to choose an appropriate translation from among multiple candidates.
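The selection step described in the final sentence amounts to picking, from several MT outputs of the same question, the one that scores highest under a metric known to correlate with QA accuracy. The Python sketch below is a minimal illustration of that idea, assuming a reference translation is available for scoring (as in the paper's constructed data set); the names select_translation and unigram_overlap are hypothetical, and the toy overlap metric merely stands in for the frequency-aware metric the abstract refers to.

```python
from typing import Callable, List

def select_translation(candidates: List[str],
                       reference: str,
                       metric: Callable[[str, str], float]) -> str:
    """Return the candidate translation that scores highest against the reference.

    `metric(candidate, reference)` stands in for a translation-quality
    measure that correlates well with downstream QA accuracy.
    """
    return max(candidates, key=lambda c: metric(c, reference))

def unigram_overlap(candidate: str, reference: str) -> float:
    """Toy metric: fraction of reference words covered by the candidate."""
    cand = set(candidate.lower().split())
    ref = set(reference.lower().split())
    return len(cand & ref) / len(ref) if ref else 0.0

if __name__ == "__main__":
    candidates = [
        "Who invented the telephone device?",
        "Who did invent telephone?",
    ]
    reference = "Who invented the telephone?"
    best = select_translation(candidates, reference, unigram_overlap)
    print(best)  # the selected candidate would then be passed to the QA system
```

In an actual CLQA pipeline the chosen translation, rather than an arbitrary single MT output, would be fed to the QA system; the paper's claim is that selecting by a metric that tracks CLQA accuracy improves the final answer accuracy.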