Abstract
Distinguishing between different people who share the same name is becoming increasingly important in person searches on the Web. The aim of this research is to assign useful labels for identifying persons to “person clusters,” which are generated as a result of person searches on the Web. In this paper, we propose a method to label person clusters with “vocation-related information.” Vocation-related information is not limited to terms rigorously defined as vocations; it also includes broader terms that may be regarded as vocations and terms that are useful for inferring vocations. Our method (a) extracts candidates for vocation-related information by using HTML structures and simple heuristics, and (b) generates vocation-related information by using term frequencies, synonym clustering, and Web search engines. Experimental results revealed the usefulness of the proposed method.
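
As a rough illustration of the frequency and synonym part of step (b), the sketch below ranks candidate terms by the number of pages in a person cluster that mention them, after merging synonyms. It is a minimal sketch under stated assumptions and not the method as implemented in this work: the `SYNONYMS` table, the `canonical` and `label_cluster` functions, and the input format (one list of candidate terms per page) are hypothetical, and the Web search engine step is omitted.

```python
from collections import Counter

# Hypothetical synonym table (an assumption for illustration only);
# the synonym clustering actually used is not specified in this abstract.
SYNONYMS = {
    "prof.": "professor",
    "lecturer": "professor",
    "ballplayer": "baseball player",
}


def canonical(term):
    """Map a term to the representative of its assumed synonym group."""
    t = term.strip().lower()
    return SYNONYMS.get(t, t)


def label_cluster(pages, top_k=1):
    """Return the top_k candidate labels for one person cluster.

    `pages` is a list of candidate-term lists, one per Web page in the
    cluster. Terms are ranked by how many pages mention them.
    """
    counts = Counter()
    for candidates in pages:
        # Count each synonym group at most once per page.
        counts.update({canonical(c) for c in candidates})
    # Sort by descending page count, then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


if __name__ == "__main__":
    cluster = [["Professor", "author"], ["Prof.", "researcher"], ["lecturer"]]
    print(label_cluster(cluster, top_k=2))
```

Counting pages rather than raw occurrences keeps a single long page from dominating the label; this is a design choice of the sketch, not a claim about the proposed method.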