2008 Volume 2008 Issue DMSM-A801 Pages 07-
We develop nonparametric Bayesian model for extracting relationships between words taken from a corpus. It is aimed at discovering semantic knowledge about words in particular domains. The subject-predicate structure is taken as a syntactic structure with the noun as the subject and the verb as the predicate. This structure is regarded as a graph structure. The generation of this graph can be modeled by using the hierarchical Dirichlet process and the Pitman-Yor process. Evaluation of this model by measuring the performance of graph clustering based on WordNet similarities demonstrated that it outperforms other baseline models.