International Journal of Biomedical Soft Computing and Human Sciences: the official journal of the Biomedical Fuzzy Systems Association
Online ISSN : 2424-256X
Print ISSN : 2185-2421
ISSN-L : 2185-2421
Finding Rare Information from the Web Using Social Bookmarks and Word Co-occurrence
Takayuki YUMOTO, Takahiro YAMANAKA, Manabu NII, Naotake KAMIURA
JOURNAL OPEN ACCESS

2017 Volume 22 Issue 1 Pages 9-18

Abstract
We propose two approaches to finding rare information on the Web, where rare information is defined as information that is both relevant and atypical. The first approach uses social bookmark data; in particular, we focus on tags, which are sometimes used to annotate the topics of pages. The second approach is based on word co-occurrence in a corpus. In both approaches, we use conditional probabilities to express relevancy and atypicality. In experiments, we compared our methods with a relevance-oriented method, a diversity-oriented method, and another rarity-oriented method. We prepared typical and atypical pages for ten queries and ranked them using each method. Our methods using word co-occurrence obtained better nDCG scores than the other methods.
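For readers unfamiliar with the evaluation measure, the following is a minimal sketch of how an nDCG score can be computed for one ranked list. It uses the standard log2-discounted formulation; the gain values (1 for an atypical, relevant page and 0 for a typical page) and the function names `dcg` and `ndcg` are illustrative assumptions, not the grading scheme or code used in the paper.

```python
import math


def dcg(gains):
    """Discounted cumulative gain of a ranked list of gain values.

    The item at 0-based position i is discounted by log2(i + 2),
    so the top-ranked item is divided by log2(2) = 1.
    """
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg(gains):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Hypothetical labels: 1 = atypical (rare) page, 0 = typical page.
good_ranking = [1, 1, 0, 0]  # rare pages ranked first
poor_ranking = [0, 0, 1, 1]  # rare pages ranked last

print(ndcg(good_ranking))
print(ndcg(poor_ranking))
```

A ranking that places all rare pages first reaches the maximum score of 1, while pushing them down the list lowers the score, which is why the measure suits a comparison of ranking methods over prepared typical and atypical pages.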
© 2017 Biomedical Fuzzy Systems Association