2017, Volume 32, Issue 5, p. D-H33_1-4
This paper presents a novel metric for evaluating the stability of machine translation systems. A stable system is one whose outputs remain almost unchanged when its inputs are changed slightly. We propose a stability metric that uses the TER (Translation Edit Rate) metric to measure the difference between two output texts. We have built an evaluation data set and demonstrate that a neural method is less stable than a statistical method, even though the former outperforms the latter.
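The idea of scoring stability with an edit-rate measure can be illustrated with a minimal sketch. The abstract does not give the paper's exact formula, so everything below is an assumption: the function names are hypothetical, the edit rate is a simplified TER that counts word-level insertions, deletions, and substitutions but omits the block shifts of full TER, and averaging over sentence pairs is only one plausible way to aggregate. A lower score means the outputs changed less, that is, the system is more stable.

```python
from typing import Sequence


def word_edit_distance(hyp: Sequence[str], ref: Sequence[str]) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simplified_ter(hyp: str, ref: str) -> float:
    """Edit distance divided by reference length.

    Assumption: this omits the shift operation of full TER, so it is an
    upper bound on TER whenever reordering occurs.
    """
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    if not ref_tokens:
        # Convention chosen for this sketch only: an empty reference scores
        # 0.0 against an empty hypothesis and 1.0 otherwise.
        return 0.0 if not hyp_tokens else 1.0
    return word_edit_distance(hyp_tokens, ref_tokens) / len(ref_tokens)


def instability(outputs_original: Sequence[str],
                outputs_perturbed: Sequence[str]) -> float:
    """Mean simplified TER between the outputs for original and perturbed inputs.

    Assumption: the output for the original input serves as the reference.
    """
    if len(outputs_original) != len(outputs_perturbed) or not outputs_original:
        raise ValueError("need two non-empty lists of equal length")
    scores = [simplified_ter(p, o)
              for o, p in zip(outputs_original, outputs_perturbed)]
    return sum(scores) / len(scores)
```

Here `outputs_original` would hold a system's translations of the unmodified inputs and `outputs_perturbed` its translations of the slightly changed inputs, in the same order. Comparing the resulting scores of two systems on the same input pairs gives a rough view of which one is more sensitive to small input changes.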