Abstract
Recently, natural language processing research has paid increasing attention to data and processing techniques for paraphrasing. Unfortunately, little paraphrase data is available. Some studies have collected synonymous expressions from parallel corpora, but no corpus well suited to collecting paraphrase sets exists, so this approach yields paraphrase sets with only a few variations of expression. In this paper, we propose a grouping method whose basic idea is to recursively group synonymous sentences that are related through translation, and then to decompose incorrect groups using the DM decomposition algorithm. An incorrect group contains expressions that cannot be paraphrases of one another, because some words or expressions have different meanings in different situations. We discuss our method and the experimental results obtained with BTEC, a multilingual parallel corpus.