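Personal names are frequently entered into search engines as queries. For such a query, however, a search engine merely returns a long list of results that mixes Web pages about several different people who share the same name. To address this problem, most previous work on personal name disambiguation in Web search results applies agglomerative clustering. In contrast, we use semi-supervised clustering, which merges documents that are similar to a given seed document. The novelty of our proposed semi-supervised clustering lies in controlling the fluctuation of the centroid of the cluster that contains the seed document.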
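The idea of merging documents into a seed cluster while limiting how far its centroid drifts can be illustrated with a minimal sketch. The abstract does not give the actual update rule, so everything here is an assumption made for illustration: the function names, the cosine similarity measure, the `threshold` stopping rule, and the `seed_weight` parameter, which simply counts the seed document several times in the running mean so that each merged document moves the centroid less.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def seeded_cluster(seed, docs, threshold=0.5, seed_weight=5.0):
    """Greedily merge documents into the cluster that contains the seed.

    seed_weight is a hypothetical damping parameter: the seed counts as
    seed_weight documents in the weighted mean, so the centroid stays
    close to the seed as other documents are merged.
    Returns the indices of merged documents and the final centroid.
    """
    centroid = list(seed)
    total_weight = seed_weight
    members = []
    remaining = list(range(len(docs)))
    while remaining:
        # Pick the unmerged document most similar to the current centroid.
        best = max(remaining, key=lambda i: cosine(centroid, docs[i]))
        if cosine(centroid, docs[best]) < threshold:
            break  # nothing left that is similar enough to the seed cluster
        # Weighted running mean; a larger total_weight means a smaller shift.
        centroid = [
            (c * total_weight + x) / (total_weight + 1.0)
            for c, x in zip(centroid, docs[best])
        ]
        total_weight += 1.0
        members.append(best)
        remaining.remove(best)
    return members, centroid


# Toy vectors (made up): documents 0 and 2 resemble the seed, document 1 does not.
seed = [1.0, 0.0]
docs = [[0.9, 0.1], [0.0, 1.0], [0.8, 0.3]]
members, centroid = seeded_cluster(seed, docs, threshold=0.8)
```

With `seed_weight=1.0` the update reduces to an ordinary unweighted centroid, which makes the effect of the damping easy to compare on the same data.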
Traditionally, word sense disambiguation (WSD) has been treated as an independent classification problem for each word in a sentence. Such approaches, however, disregard the interdependencies among word senses. Moreover, because they build a separate sense classifier for each word, they apply only to words for which training instances are available. In this paper, we propose a supervised WSD model based on the syntactic dependencies of word senses. In particular, we assume that strong dependencies exist between the sense of a syntactic head and the senses of its dependents. We model these dependencies with tree-structured conditional random fields (T-CRFs) and obtain the sense assignment that is optimal over the whole sentence. Furthermore, we combine these sense dependencies with various coarse-grained sense tag sets, which are expected to alleviate the data sparseness problem and enable our model to work even for words that do not appear in the training data. Our experiments demonstrate the benefit of considering the syntactic dependencies of senses, as well as the improvements gained from coarse-grained tag sets. The performance of our model is shown to be comparable to that of state-of-the-art WSD systems. We also present an in-depth analysis of the effectiveness of the sense dependency features through intuitive examples.
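To make the notion of a sense assignment "optimal over the whole sentence" concrete, the following is a minimal sketch of max-sum decoding over a dependency tree, the standard way to find the highest-scoring labeling in a tree-structured model. It is not the paper's implementation: it covers decoding only, not T-CRF training, and the function name `decode_tree`, the score functions, the sense labels, and all numeric scores are hypothetical values chosen for illustration.

```python
def decode_tree(children, root, senses, node_score, edge_score):
    """Return the highest-scoring sense assignment over a dependency tree.

    children:   dict mapping a head word to the list of its dependents
    senses:     dict mapping each word to its candidate senses
    node_score: function (word, sense) -> float
    edge_score: function (head, head_sense, dep, dep_sense) -> float
    """
    best = {}  # best[word][sense] = best score of the subtree rooted at word
    back = {}  # back[(dep, head_sense)] = best dependent sense for that head sense

    def upward(node):
        for child in children.get(node, []):
            upward(child)
        best[node] = {}
        for s in senses[node]:
            total = node_score(node, s)
            for child in children.get(node, []):
                choice = max(
                    senses[child],
                    key=lambda t: edge_score(node, s, child, t) + best[child][t],
                )
                back[(child, s)] = choice
                total += edge_score(node, s, child, choice) + best[child][choice]
            best[node][s] = total

    def downward(node, sense, out):
        out[node] = sense
        for child in children.get(node, []):
            downward(child, back[(child, sense)], out)

    upward(root)
    root_sense = max(senses[root], key=lambda s: best[root][s])
    assignment = {}
    downward(root, root_sense, assignment)
    return assignment, best[root][root_sense]


# Toy example with made-up scores: the head "deposit" governs "bank".
children = {"deposit": ["bank"]}
senses = {"deposit": ["put_money", "sediment"], "bank": ["finance", "river"]}
unary = {
    ("deposit", "put_money"): 0.5, ("deposit", "sediment"): 0.4,
    ("bank", "finance"): 0.2, ("bank", "river"): 0.6,
}
pairwise = {("put_money", "finance"): 1.0, ("sediment", "river"): 0.3}

assignment, score = decode_tree(
    children, "deposit", senses,
    node_score=lambda w, s: unary[(w, s)],
    edge_score=lambda h, hs, d, ds: pairwise.get((hs, ds), 0.0),
)
```

In this toy setting, classifying "bank" on its own would pick `river` (0.6 versus 0.2), whereas the joint decoding picks `finance` because the head-dependent score favors the pair `put_money` and `finance`.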