2015 Volume 10 Issue 2 Pages 294-304
Entity-centric search has become a demanding problem for many domains on the Web. In particular, the suitable contextualization of result documents poses challenges in terms of selecting the most adequate indexing terms for later retrieval. This holds even more if no generally recognized ontologies for the respective domain are available. In this paper, we show that cross-domain ontology terms are actually more useful for indexing than salient keywords taken from the documents. Moreover, learning typical contexts for groups of entities from collections indexed by strong cross-domain ontologies can considerably improve retrieval effectiveness. Our extensive experiments confirm these results on real-world document collections from the areas of chemistry and computer science. In fact, our evaluation in different document retrieval scenarios shows a substantial increase in retrieval precision of up to 87% using documents annotated with cross-domain ontology terms, as compared to 53% for BM25 searches and 43% for documents annotated with Wikipedia categories.