An Approach to the Problem of Identifying Notational Variants of Reference Descriptions

相澤 彰子; 宮田 淳平

doi:10.14864/fss.25.0.144.0

25th Fuzzy System Symposium

Session ID: 2D4-03


Conference information

Host: Japan Society for Fuzzy Theory and Intelligent Informatics (SOFT)

On Identifying Names with Notation Variations

*Akiko Aizawa, Jumpei Miyata


Keywords: Name Matching Problem, Entity Identification, Record Linkage, Scientific Papers Databases, String Similarity Measure

CONFERENCE PROCEEDINGS FREE ACCESS


Abstract

Statistical analysis of legacy databases often requires grouping mentions that refer to the same real-world entity. This preprocessing becomes particularly important for large-scale databases, where names vary so widely that the cost of generating dictionaries or normalization rules is infeasibly high. In this paper, we therefore investigate methods for automatic name matching and discuss the advantages and disadvantages of (i) a binary classifier that determines whether two mentions refer to the same entity and (ii) a graph-based clustering algorithm that disambiguates two similar mentions using their global features.
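As a rough illustration of the two ingredients named in the abstract, the sketch below pairs a pairwise "same entity?" decision with a grouping step over the resulting graph. It is not the authors' method: the pairwise decision is a fixed threshold on a generic string similarity (Python's `difflib.SequenceMatcher`) standing in for a trained binary classifier, and the grouping step is plain connected components rather than a clustering algorithm that uses global features. The function names, the normalization rule, and the threshold of 0.85 are all assumptions made for this example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(mention: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in mention.lower())
    return " ".join(kept.split())


def same_entity(a: str, b: str, threshold: float = 0.85) -> bool:
    """Hypothetical stand-in for a pairwise binary classifier.

    Returns True when the string similarity of the normalized mentions
    reaches the (assumed) threshold.
    """
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def cluster_mentions(mentions: list[str], threshold: float = 0.85) -> list[list[str]]:
    """Group mentions into connected components of the 'same entity' graph."""
    parent = list(range(len(mentions)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Add an edge (merge two components) for every pair judged to match.
    for i, j in combinations(range(len(mentions)), 2):
        if same_entity(mentions[i], mentions[j], threshold):
            parent[find(i)] = find(j)

    groups: dict[int, list[str]] = {}
    for i, mention in enumerate(mentions):
        groups.setdefault(find(i), []).append(mention)
    return list(groups.values())


if __name__ == "__main__":
    for group in cluster_mentions(["Univ. of Tokyo", "univ of tokyo", "Kyoto University"]):
        print(group)
```

Connected components merge any two mentions linked by a chain of pairwise matches, so a single wrong pairwise decision can fuse two entities. That weakness of purely local decisions is one reason to consider clustering that looks at global features, as the abstract's option (ii) does.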

* Corresponding author
