2015 Volume 2015 Issue DOCMAS-009 Pages 02-
In this paper, we propose two models for weighting each term in a document for document retrieval. The models build on the traditional Term Frequency (TF) and on the Term Weight (TW) proposed in 2013. TF is based on the number of occurrences of a term in a document and is used as the de facto standard. TW, on the other hand, is based on the variety of a term's co-occurrences in a document and outperforms TF. Our proposed models give greater weight to terms that co-occur with frequently occurring terms. We present experimental results comparing the proposed models with the conventional ones on a very large text corpus.
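
To make the three weighting ideas concrete, the following is a minimal illustrative sketch, not the definitions used in the paper. It assumes a tokenized document and a fixed co-occurrence window; the function names, the `window` parameter, and the exact formulas for the co-occurrence-based weights are hypothetical choices made here for illustration only. TF counts occurrences; the TW-style weight counts how many distinct terms appear near a term; the last function is one possible reading of "give greater weight to terms that co-occur with frequently occurring terms", summing the TF of a term's distinct neighbors.

```python
from collections import Counter, defaultdict


def term_frequency(tokens):
    """TF: number of occurrences of each term in the document."""
    return Counter(tokens)


def cooccurrence_neighbors(tokens, window=2):
    """Map each term to the set of distinct other terms found
    within `window` positions of any of its occurrences.
    (The window-based notion of co-occurrence is an assumption.)"""
    neighbors = defaultdict(set)
    n = len(tokens)
    for i, term in enumerate(tokens):
        bucket = neighbors[term]  # ensures every term gets an entry
        for j in range(max(0, i - window), min(n, i + window + 1)):
            if j != i and tokens[j] != term:
                bucket.add(tokens[j])
    return neighbors


def distinct_cooccurrence_weight(tokens, window=2):
    """TW-style weight (assumed form): number of distinct co-occurring terms."""
    neighbors = cooccurrence_neighbors(tokens, window)
    return {term: len(nbrs) for term, nbrs in neighbors.items()}


def neighbor_frequency_weight(tokens, window=2):
    """Hypothetical weight: sum of the TF of each distinct neighbor,
    so terms near frequent terms score higher."""
    tf = term_frequency(tokens)
    neighbors = cooccurrence_neighbors(tokens, window)
    return {term: sum(tf[nb] for nb in nbrs) for term, nbrs in neighbors.items()}


if __name__ == "__main__":
    doc = "term weight models rank each term by its weight in the document".split()
    print(term_frequency(doc))
    print(distinct_cooccurrence_weight(doc))
    print(neighbor_frequency_weight(doc))
```

In this sketch a term that occurs only once can still receive a high weight if its neighbors are frequent, which is the qualitative behavior the abstract describes; the actual models may normalize or combine these quantities differently.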