Chinese Person Name Disambiguation Based on Two-Stage Clustering

Jie Zhou; Bicheng Li; Yongwang Tang

doi:10.20965/jaciii.2016.p0755

抄録

Person name clustering disambiguation is the process that partitions name mentions according to corresponding target person entities in reality. The existed methods can not realize effective identification of important features to disambiguate person names. This paper presents a method of Chinese person name disambiguation based on two-stage clustering. This method adopts a stage-by-stage processing model to identify and utilize different types of important features. Firstly, we extract three kinds of core evidences namely direct social relation, indirect social relation and common description prefix, recognize document-pairs referring to the same person entity, and realize initial clustering of person names with high precision. Then, we take the result of initial clustering as new initial input, utilize the statistical properties of multi-documents to recognize and evaluate important features, and build a double-vector representation of clusters (cluster feature vector and important feature vector). Based on the processes above, the final clustering of person names is generated, and the recall of clustering is improved effectively. The experiments have been conducted on the dataset of CLP2010 Chinese person names disambiguation, and experimental results show that this method has good performance in person name clustering disambiguation.

著者関連情報

この記事は最新の被引用情報を取得できません。

This article is licensed under a Creative Commons [Attribution-NoDerivatives 4.0 International] license (https://creativecommons.org/licenses/by-nd/4.0/).
The journal is fully Open Access under Creative Commons licenses and all articles are free to access at JACIII Official Site.
https://www.fujipress.jp/jaciii/jc-about/

お気に入り & アラート

閲覧履歴

Entity-fishing: A DARIAH Entity Recognition and Disambiguation Service

創刊号からの全論文のPDFは
JACIII公式サイトで公開中(無料)
doiリンクをクリック！

責任著者(Corresponding author)

訂正情報

J-STAGEへの登録はこちら（無料）