Journal of Advanced Computational Intelligence and Intelligent Informatics
Online ISSN : 1883-8014
Print ISSN : 1343-0130
ISSN-L : 1883-8014
Regular Papers
Chinese Person Name Disambiguation Based on Two-Stage Clustering
Jie Zhou, Bicheng Li, Yongwang Tang
Journal, Open Access

2016 Volume 20 Issue 5 Pages 755-764

Abstract

Person name disambiguation by clustering partitions name mentions according to the real-world person entities they refer to. Existing methods cannot effectively identify the important features needed to disambiguate person names. This paper presents a method for Chinese person name disambiguation based on two-stage clustering. The method adopts a stage-by-stage processing model to identify and exploit different types of important features. First, we extract three kinds of core evidence (direct social relations, indirect social relations, and common description prefixes), recognize document pairs that refer to the same person entity, and produce an initial clustering of person names with high precision. Then, taking the initial clustering as input, we use the statistical properties of multiple documents to recognize and evaluate important features, and build a double-vector representation of each cluster (a cluster feature vector and an important feature vector). From this representation, the final clustering of person names is generated, which effectively improves recall. Experiments were conducted on the CLP2010 Chinese person name disambiguation dataset, and the results show that the method performs well on person name clustering disambiguation.
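The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the evidence extraction, the importance measure, or the similarity function, so everything concrete here is an assumption. In particular, the function names (`stage_one`, `stage_two`, `cluster_vectors`), the `evidence` and `terms` fields, the rule that a term is "important" when it occurs in at least two documents of a cluster, and the `threshold` and `alpha` values are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def stage_one(docs):
    """Stage 1 (assumed rule): link documents sharing any core evidence."""
    parent = list(range(len(docs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(docs)), 2):
        if docs[i]["evidence"] & docs[j]["evidence"]:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(docs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def cosine(a, b):
    """Cosine similarity of two sparse term-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_vectors(cluster, docs):
    """Return (cluster feature vector, important feature vector)."""
    features, doc_freq = Counter(), Counter()
    for i in cluster:
        features.update(docs[i]["terms"])
        doc_freq.update(set(docs[i]["terms"]))
    # Assumed importance rule: the term appears in >= 2 documents of the cluster.
    important = Counter({t: features[t] for t, df in doc_freq.items() if df >= 2})
    return features, important


def stage_two(clusters, docs, threshold=0.3, alpha=0.5):
    """Stage 2 (assumed rule): greedily merge clusters whose combined
    double-vector similarity reaches the threshold."""
    clusters = [list(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        vectors = [cluster_vectors(c, docs) for c in clusters]
        for i, j in combinations(range(len(clusters)), 2):
            sim = (alpha * cosine(vectors[i][0], vectors[j][0])
                   + (1 - alpha) * cosine(vectors[i][1], vectors[j][1]))
            if sim >= threshold:
                clusters[i] = sorted(clusters[i] + clusters[j])
                del clusters[j]
                merged = True
                break  # recompute vectors after every merge
    return sorted(clusters)


# Toy documents mentioning the same ambiguous name (invented for illustration).
docs = [
    {"evidence": {"colleague:Wang Fang"}, "terms": ["professor", "physics", "university"]},
    {"evidence": {"colleague:Wang Fang"}, "terms": ["professor", "physics", "laboratory"]},
    {"evidence": set(), "terms": ["physics", "professor", "lecture"]},
    {"evidence": {"prefix:footballer"}, "terms": ["football", "league", "goal"]},
    {"evidence": {"prefix:footballer"}, "terms": ["football", "league", "transfer"]},
]

initial = stage_one(docs)
final = stage_two(initial, docs)
print(initial)  # [[0, 1], [2], [3, 4]]
print(final)    # [[0, 1, 2], [3, 4]]
```

In this toy run, the first stage groups only documents with shared evidence, so document 2 stays alone (high precision, lower recall). The second stage then attaches it to the physicist cluster through term overlap, which mirrors the recall improvement the abstract attributes to the final clustering.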


© 2016 Fuji Technology Press Ltd.

This article is licensed under a Creative Commons [Attribution-NoDerivatives 4.0 International] license (https://creativecommons.org/licenses/by-nd/4.0/).
The journal is fully Open Access under Creative Commons licenses and all articles are free to access at JACIII Official Site.
https://www.fujipress.jp/jaciii/jc-about/