自然言語処理
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
論文
Measuring the Appropriateness of Automatically Generated Phrasal Paraphrases
Atsushi FujitaSatoshi Sato
著者情報
ジャーナル フリー

2010 年 17 巻 1 号 p. 1_183-1_219

詳細
抄録
The most critical issue in generating and recognizing paraphrases is developing a wide-coverage paraphrase knowledge base. To attain the coverage of paraphrases that should not necessarily be represented at surface level, researchers have attempted to represent them with general transformation patterns. However, this approach does not prevent spurious paraphrases because there is no practical method to assess whether or not each instance of those patterns properly represents a pair of paraphrases. This paper argues on the measurement of the appropriateness of such automatically generated paraphrases, particularly targeting at morpho-syntactic paraphrases of predicate phrases. We first specify the criteria that a pair of expressions must satisfy to be regarded as paraphrases. On the basis of the criteria, we then examine several measures for quantifying the appropriateness of a given pair of expressions as paraphrases of each other. In addition to existing measures, a probabilistic model consisting of two distinct components is examined. The first component of the probabilistic model is a structured N-gram language model that quantifies the grammaticality of automatically generated expressions. The second component approximates the semantic equivalence and substitutability of the given pair of expressions on the basis of the distributional hypothesis. Through an empirical experiment, we found (i) the effectiveness of contextual similarity in combination with the constituent similarity of morpho-syntactic paraphrases and (ii) the versatility of the Web for representing the characteristics of predicate phrases.
著者関連情報
© 2010 The Association for Natural Language Processing
前の記事 次の記事
feedback
Top