JSAI Technical Report, Type 2 SIG
Online ISSN : 2436-5556
Indexing based on term co-occurrence and frequency
Sohei OKUI, Akihiro INOKUCHI
RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

2015 Volume 2015 Issue DOCMAS-009 Pages 02-

Abstract

In this paper, we propose two models for weighting each term in a document for document retrieval. The models build on the traditional Term Frequency (TF) and on Term Weight (TW), which was proposed in 2013. TF is based on the number of times a term occurs in a document and is used as the de facto standard. TW, on the other hand, is based on the variation in a term's co-occurrences within a document and outperforms TF. Our proposed models give greater weight to terms that co-occur with frequently occurring terms. We present experimental results comparing the proposed models with the conventional ones on a very large text corpus.
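
The abstract gives no formulas, so the following Python sketch is only an illustration of the three weighting ideas it names, not the authors' definitions. It assumes that two terms co-occur when they fall within a sliding window of tokens, that a TW-style weight can be approximated by the number of distinct co-occurring terms, and that a weight favoring terms that co-occur with frequent terms can be approximated by summing the frequencies of those co-occurring terms. The function names and the `window` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def term_frequency(tokens):
    """TF: the number of times each term occurs in the document."""
    return Counter(tokens)


def cooccurrence_neighbors(tokens, window=3):
    """Map each term to the set of distinct terms seen within `window`
    consecutive tokens of it (assumed definition of co-occurrence)."""
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        # Look at the next (window - 1) tokens after position i.
        for other in tokens[i + 1 : i + window]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)
    return neighbors


def tw_style_weights(tokens, window=3):
    """Assumed TW-like weight: the number of distinct co-occurring terms."""
    neighbors = cooccurrence_neighbors(tokens, window)
    return {t: len(neighbors.get(t, ())) for t in set(tokens)}


def frequency_aware_weights(tokens, window=3):
    """Assumed combined weight: the sum of the frequencies of the terms
    that co-occur with each term, so that neighbors of frequent terms
    receive more weight."""
    tf = term_frequency(tokens)
    neighbors = cooccurrence_neighbors(tokens, window)
    return {t: sum(tf[n] for n in neighbors.get(t, ())) for t in tf}


if __name__ == "__main__":
    doc = "information retrieval ranks documents by term weight".split()
    print(term_frequency(doc))
    print(tw_style_weights(doc))
    print(frequency_aware_weights(doc))
```

Under these assumptions, a rare term that appears next to a frequent term receives a higher combined weight than its own frequency alone would give it, which is the behavior the abstract describes in words.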

© 2015 Authors