2024, Volume 39, Issue 6, pp. C-O42_1-14
Entity Linking (EL) is a technique that links mentions (context-dependent token sequences that may refer to specific entities) in a text to the corresponding entities in a knowledge base. It is a foundational technology in knowledge processing and natural language processing. Most EL research focuses on English, and little of it targets Japanese. Although research on multilingual EL models, which can be applied to languages other than English, including Japanese, is advancing, EL that accounts for the characteristics of individual languages has not been fully achieved. In addition, recent EL research uses word embeddings and knowledge graph embeddings to construct EL models, so building an EL model for a specific language requires language-specific adaptations. A Pointer Network selects the elements of its output sequence from its input sequence. Using a Pointer Network may therefore make it possible to link entities that do not appear in the training data. In this study, we propose a Japanese EL model for Wikidata based on the Pointer Network (the Japanese PNEL model). We evaluated the accuracy of the Japanese PNEL model and analyzed which features are effective in Japanese EL and in English EL. We also constructed Japanese EL evaluation datasets by machine-translating the English EL evaluation datasets WebQSP, SimpleQuestions, and LC-QuAD2 into Japanese. In comparison experiments with existing multilingual models, the Japanese PNEL model achieved a higher F1 score than those models on the machine-translated Japanese WebQSP. We further analyzed effective features through experiments on how the dimensionality and the choice of model for word embeddings and knowledge graph embeddings affect EL accuracy. The ablation study showed that knowledge graph embeddings are as effective in Japanese as in English.
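
The pointing mechanism mentioned in the abstract, in which each output is chosen from among the input elements rather than from a fixed vocabulary, can be illustrated with a minimal sketch. This is not the Japanese PNEL model itself: the function names, the toy vectors, and the use of plain dot-product scoring (in place of the learned attention that a Pointer Network would typically use) are all assumptions made for illustration.

```python
import math


def pointer_distribution(query, candidates):
    """Softmax over input positions, scored by dot product (a simplification)."""
    scores = [sum(q * c for q, c in zip(query, cand)) for cand in candidates]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def point(query, candidates):
    """Return the index of the input element the query points to."""
    probs = pointer_distribution(query, candidates)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical feature vectors standing in for candidate entities of a mention.
candidates = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
# Hypothetical decoder state for the mention being linked.
query = [0.1, 0.9]

probs = pointer_distribution(query, candidates)
chosen = point(query, candidates)
```

Because the output is an index into whatever candidates are supplied as input, the same scoring function can in principle be applied to candidates that never occurred during training, which is the property the abstract appeals to.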