Journal of Japan Society of Library and Information Science
Online ISSN : 2432-4027
Print ISSN : 1344-8668
ISSN-L : 1344-8668
Article
The Extent of the Deep Web in Japanese Institutional Repositories
Yosuke MIYATA, Teru AGATA, Atsushi IKEUCHI, Emi ISHITA, Shuichi UEDA
JOURNAL OPEN ACCESS

2012 Volume 58 Issue 2 Pages 97-109

Abstract

As the Web grows, the problem of the deep Web (the part of the Web not accessible to search engines) becomes more serious. McCown et al. (2006) and Hagedorn & Santelli (2008) surveyed the extent of the deep Web using metadata contained in institutional repositories. Applying the method used in that previous work, we measured the extent of the deep Web on a larger scale, using the URLs of PDF files contained in institutional repositories in Japan in September 2009. The results show that the coverage rate of the major search engines (Google, Yahoo! and Bing) is 72%, leaving 28% as the maximum extent of the deep Web. An examination of the characteristics of the files further revealed that dynamic URLs and longer URLs are associated with lower search engine coverage rates.
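
The coverage measure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the combined coverage rate is the share of sampled PDF URLs indexed by at least one engine, and that a "dynamic" URL is one carrying a query string; the abstract spells out neither. The function names, the engine labels, and the example.org URLs are all hypothetical.

```python
from urllib.parse import urlsplit


def is_dynamic(url: str) -> bool:
    """Assumed definition: a URL is 'dynamic' if it has a query string."""
    return bool(urlsplit(url).query)


def coverage_rate(urls, indexed_by_engine):
    """Share of URLs indexed by at least one engine (assumed definition)."""
    urls = list(urls)
    if not urls:
        return 0.0
    # Union of every engine's set of indexed URLs.
    covered = set().union(*indexed_by_engine.values())
    return sum(1 for u in urls if u in covered) / len(urls)


def coverage_by_group(urls, indexed_by_engine, key):
    """Coverage rate computed separately for each value of key(url)."""
    groups = {}
    for u in urls:
        groups.setdefault(key(u), []).append(u)
    return {g: coverage_rate(m, indexed_by_engine) for g, m in groups.items()}


# Hypothetical sample: four repository PDF URLs and three engines.
sample_urls = [
    "https://repo.example.org/files/a.pdf",
    "https://repo.example.org/files/b.pdf",
    "https://repo.example.org/dspace/bitstream?id=3",
    "https://repo.example.org/dspace/bitstream?id=4",
]
sample_index = {
    "engine_a": {sample_urls[0], sample_urls[2]},
    "engine_b": {sample_urls[0], sample_urls[1]},
    "engine_c": set(),
}

rate = coverage_rate(sample_urls, sample_index)
print(f"coverage: {rate:.0%}, maximum deep Web share: {1 - rate:.0%}")
print(coverage_by_group(sample_urls, sample_index, is_dynamic))
```

Under these assumptions, the complement of the coverage rate is an upper bound on the deep Web share, which is how the 28% figure follows from the 72% coverage rate. Grouping by `is_dynamic` (or by a URL-length bucket) mirrors the kind of comparison behind the finding on dynamic and longer URLs.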

© 2012 Japan Society of Library and Information Science