Abstract
In this paper, we propose a similarity measure for extracting a particular kind of concord expression from multiple large text corpora. This concord expression, called the KOOU expression, is particular to Japanese. A KOOU expression consists of a KO element and an OU element. The KO element, called a “declarative adverb”, determines the form of the verb expression (the OU element). In Japanese, KOOU expressions allow a sentence to be understood incrementally. So far, no practical database of KOOU expressions exists. We therefore attempt to extract KOOU expressions, and we compare and evaluate seven similarity measures in order to establish a method for extracting them objectively and comprehensively. To evaluate the extracted results, we build judgment data using the pooling method, which is well known in information retrieval: we pool the top 500 results of each measure and have them judged by humans. Based on our experimental results, we report that Yates' correction and the Complementary Similarity Measure are well suited to extracting KOOU expressions from corpora.
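The pooling step described above can be illustrated with a minimal sketch. It assumes each similarity measure yields a ranked list of candidate (KO element, OU element) pairs; the function names `build_pool` and `precision_at_depth`, the data layout, and the use of precision as the score are illustrative assumptions, not the procedure actually used in this paper.

```python
from typing import Dict, List, Set, Tuple

# A candidate KOOU expression, represented here as a (KO element, OU element) pair.
Pair = Tuple[str, str]


def build_pool(ranked: Dict[str, List[Pair]], depth: int = 500) -> Set[Pair]:
    """Union of the top `depth` candidates from every measure's ranking.

    The resulting pool is what human judges would label, so each candidate
    is judged once even if several measures rank it highly.
    """
    pool: Set[Pair] = set()
    for results in ranked.values():
        pool.update(results[:depth])
    return pool


def precision_at_depth(results: List[Pair], relevant: Set[Pair], depth: int = 500) -> float:
    """Fraction of a measure's top `depth` candidates that were judged correct."""
    top = results[:depth]
    if not top:
        return 0.0
    return sum(1 for pair in top if pair in relevant) / len(top)


# Toy rankings with placeholder element names (hypothetical data).
ranked = {
    "measure_1": [("ko1", "ou1"), ("ko2", "ou2"), ("ko3", "ou3")],
    "measure_2": [("ko2", "ou2"), ("ko4", "ou4"), ("ko1", "ou1")],
}
pool = build_pool(ranked, depth=2)
judged_correct = {("ko1", "ou1")}  # stand-in for the human judgments
score = precision_at_depth(ranked["measure_1"], judged_correct, depth=2)
```

With a depth of 500 and seven measures, the pool would contain at most 3,500 candidates, and fewer whenever the measures agree on their top results.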