Journal of Advanced Computational Intelligence and Intelligent Informatics
Online ISSN : 1883-8014
Print ISSN : 1343-0130
ISSN-L : 1883-8014
Regular Papers
Representation Learning with LDA Models for Entity Disambiguation in Specific Domains
Shengchen Jiang, Yantuan Xian, Hongbin Wang, Zhiju Zhang, Huaqin Li

2021 Volume 25 Issue 3 Pages 326-334

Abstract

Entity disambiguation is extremely important in knowledge construction. Word representation models ignore the influence of word order on sentence- and text-level information. Thus, we propose a domain entity disambiguation method that fuses the doc2vec and LDA topic models. In this study, the doc2vec document representation model is used to obtain vector representations of the entity mention and the candidate entities from the domain corpus and the knowledge base, respectively. Moreover, context similarity and category referential similarity are computed based on a constructed domain knowledge base of hypernym-hyponym relations. The LDA topic model and the doc2vec model are combined to obtain distinct word representations for the different senses of polysemous words. We use the k-means algorithm to cluster the word vectors under different topics to obtain the topic domain keywords of the text, and compute similarities under the domain keywords of the different topics. Finally, the similarities of the three feature types are fused, and the candidate entity with the highest similarity is selected as the final target entity. The experimental results demonstrate that the proposed method outperforms existing models, which proves its feasibility and effectiveness.
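As a rough illustration of the pipeline the abstract describes (not the authors' implementation), the following Python sketch uses gensim's doc2vec and LDA models with scikit-learn's k-means: it embeds a mention context and candidate entity descriptions, derives per-topic keywords, and fuses several similarity scores to pick a target entity. The toy corpus, candidate descriptions, fusion weights, and the placeholder category/topic similarities are illustrative assumptions, not values from the paper.

```python
# Sketch of a fused-similarity disambiguation pipeline (assumed, simplified).
import numpy as np
from gensim.corpora import Dictionary
from gensim.models import LdaModel
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.cluster import KMeans


def cosine(u, v):
    # Cosine similarity with a small epsilon to avoid division by zero.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))


# Toy domain corpus of tokenised sentences (real inputs would come from the
# domain corpus and the knowledge base).
corpus = [
    ["apple", "fruit", "orchard", "harvest"],
    ["apple", "iphone", "smartphone", "release"],
    ["orange", "fruit", "juice", "vitamin"],
]

# 1) doc2vec representations of the mention context and candidate descriptions.
tagged = [TaggedDocument(words=toks, tags=[i]) for i, toks in enumerate(corpus)]
d2v = Doc2Vec(vector_size=50, min_count=1, epochs=40)
d2v.build_vocab(tagged)
d2v.train(tagged, total_examples=d2v.corpus_count, epochs=d2v.epochs)

mention_context = ["apple", "released", "a", "new", "smartphone"]
candidates = {
    "Apple_Inc": ["technology", "company", "iphone", "smartphone"],
    "Apple_fruit": ["fruit", "tree", "orchard", "harvest"],
}
mention_vec = d2v.infer_vector(mention_context)

# 2) LDA topics over the corpus; cluster the topic-word vectors with k-means to
#    obtain per-topic domain keywords (a stand-in for the keyword step above).
dictionary = Dictionary(corpus)
bow = [dictionary.doc2bow(toks) for toks in corpus]
lda = LdaModel(bow, num_topics=2, id2word=dictionary, passes=20, random_state=0)
topic_words = sorted({w for t in range(2) for w, _ in lda.show_topic(t, topn=3)})
word_vecs = np.array([d2v.wv[w] for w in topic_words])
keyword_clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit(word_vecs)

# 3) Fuse the similarity features with assumed weights and pick the best candidate.
weights = {"context": 0.5, "category": 0.3, "topic": 0.2}  # illustrative only


def score(candidate_tokens):
    cand_vec = d2v.infer_vector(candidate_tokens)
    context_sim = cosine(mention_vec, cand_vec)
    category_sim = 1.0  # placeholder for hypernym-hyponym category similarity
    topic_sim = 1.0     # placeholder for similarity under per-topic keywords
    return (weights["context"] * context_sim
            + weights["category"] * category_sim
            + weights["topic"] * topic_sim)


best = max(candidates, key=lambda name: score(candidates[name]))
print("Predicted target entity:", best)
```

In the full method, the two placeholder scores would be replaced by the category referential similarity computed over the hypernym-hyponym knowledge base and the similarity computed under the clustered per-topic domain keywords.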

© 2021 Fuji Technology Press Ltd.