Linked Open Data (LOD) has a graph structure in which nodes are identified by URIs, which allows LOD datasets to be connected and searched across different domains. In practice, however, 5% of the values are literals (strings without a URI) even in DBpedia, which is a de facto hub of LOD. This paper therefore proposes a method for identifying and aggregating literal nodes, so that literals with the same meaning can be given a URI and data linkage is promoted. Our method regards part of the LOD graph structure as a block image, extracts image features based on the Scale-Invariant Feature Transform (SIFT), and performs ensemble learning, an approach well known in the field of computer vision. In an experiment, we created about 30,000 literal pairs from a Japanese music category of DBpedia Japanese and Freebase, and confirmed that the proposed method correctly determines literal identity with an F-measure of 76–85%.
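To make the idea of treating part of the graph as a block image more concrete, the following is a minimal sketch of one possible encoding. It is not the encoding used in the paper: the function name `block_image`, the row/column layout (subjects by predicates), and the sample triples are all assumptions made for illustration. A matrix of this kind could then be passed to an image feature extractor such as SIFT.

```python
from typing import List, Tuple

# A triple is (subject, predicate, object); objects may be URIs or literals.
Triple = Tuple[str, str, str]


def block_image(triples: List[Triple], literal: str) -> List[List[int]]:
    """Hypothetical encoding: build a binary matrix around one literal.

    Rows are the subjects that have `literal` as an object.
    Columns are all predicates used by those subjects.
    A cell is 1 if the subject uses that predicate, else 0.
    """
    subjects = sorted({s for s, _, o in triples if o == literal})
    predicates = sorted({p for s, p, _ in triples if s in subjects})
    used = {(s, p) for s, p, _ in triples}
    return [[1 if (s, p) in used else 0 for p in predicates] for s in subjects]


# Illustrative data only; these identifiers are not taken from DBpedia or Freebase.
triples = [
    ("ex:Song1", "ex:artist", "Artist A"),
    ("ex:Song1", "ex:genre", "ex:JPop"),
    ("ex:Song2", "ex:artist", "Artist A"),
    ("ex:Song2", "ex:label", "ex:LabelA"),
]

image = block_image(triples, "Artist A")
for row in image:
    print(row)
```

Under this assumed encoding, two literals that denote the same entity would tend to produce similar block patterns, which is the kind of similarity an image-feature-based classifier could pick up.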