2014 Volume 2014 Issue SWO-032 Pages 07-
Linked Open Data (LOD) has a graph structure in which nodes are identified by URIs, so LOD datasets can be connected and searched across different domains. In practice, however, 5% of values are literals (strings without URIs) even in DBpedia, a de facto hub of LOD. This paper therefore proposes a method to identify and aggregate literal nodes, so that literals with the same meaning can be given a URI and data linkage is promoted. Our method regards part of the LOD graph structure as a block image, extracts image features based on SIFT, and then performs ensemble learning, an approach well known in computer vision. In the experiment, we created about 30,000 literal pairs from the Japanese music category of DBpedia Japanese and Freebase, and confirmed that the proposed method correctly determines literal identity with an F-measure of 99%.
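For reference, the F-measure quoted above is conventionally the harmonic mean of precision and recall. The abstract does not say which variant was used, so the balanced F1 form below is an assumption, and the symbols TP, FP, and FN (counts of literal pairs judged correctly identical, wrongly identical, and wrongly distinct) are introduced here for illustration only.

```latex
% Assumed balanced F-measure (F1); TP, FP, FN are illustrative symbols,
% not notation taken from the paper.
\[
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2PR}{P + R}
\]
```

Under this reading, an F-measure of 99% requires both precision and recall to be high, since the harmonic mean is pulled toward the smaller of the two.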