We present a system that attempts to automatically collect and monitor Japanese blog collections, including not only blogs created with blog software but also those written as ordinary web pages. Our approach is based on extracting date expressions and analyzing HTML documents. The system also extracts and mines useful information from the collected blog pages.
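As an illustration of the date-expression idea, the following is a minimal sketch, not the system's actual implementation. It assumes that a page carrying several distinct calendar dates in its visible text is a candidate blog page. The date patterns, the `min_dates` threshold, and the function names `extract_dates` and `looks_like_blog` are all hypothetical choices made for this example.

```python
import re
from datetime import date
from html.parser import HTMLParser

# Assumed date formats: "2004年5月3日", "2004/05/03", "2004-05-03", "2004.5.3".
# Mixed separators are tolerated for simplicity.
DATE_RE = re.compile(
    r"(?<!\d)(\d{4})\s*(?:年|[/.\-])\s*(\d{1,2})\s*(?:月|[/.\-])\s*(\d{1,2})(?:\s*日)?(?!\d)"
)


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def extract_dates(html):
    """Return the valid calendar dates found in the visible text of an HTML page."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    found = []
    for y, m, d in DATE_RE.findall(text):
        try:
            found.append(date(int(y), int(m), int(d)))
        except ValueError:
            continue  # e.g. month 13: matches the pattern but is not a real date
    return found


def looks_like_blog(html, min_dates=2):
    """Hypothetical heuristic: several distinct dated entries suggest a blog."""
    return len(set(extract_dates(html))) >= min_dates


sample = (
    "<html><body><h2>2004年5月3日</h2><p>今日は野球を見た。</p>"
    "<h2>2004/05/01</h2><p>日記</p>"
    "<script>var d = '1999-01-01';</script></body></html>"
)
dates = extract_dates(sample)
```

A real collector would likely combine such a date signal with structural analysis of the HTML (for example, repeated entry blocks), which this sketch does not attempt.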
A weblog (blog) is a kind of web content that straightforwardly represents people's interests. We propose content browsing methods that integrate blog information in order to provide a "human-aware" content browsing experience. In this paper, we discuss how blog information changes the current content browsing experience.
Recently, many blog hosting services have appeared in Japan, and the number of users keeps growing. Blog users appear to be forming new communities on the Internet. In this paper, we extract and analyze blog communities based on the link structure of blogs. A blog network is constructed as follows: each blog (all entries of one blog) is treated as a node, and a link between two blogs is treated as an edge. We conduct two experiments on the blog networks for two topics, "baseball" and "Winny": scoring blog pages with PageRank and clustering blog communities with betweenness clustering.
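The network construction and the two analyses can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: links are given as hypothetical `(source_blog, target_blog)` pairs, PageRank is computed by plain power iteration with an assumed damping factor of 0.85, and betweenness clustering is approximated in the Girvan-Newman style by repeatedly removing the edge with the highest shortest-path betweenness on the undirected version of the graph. All function names and the toy data are invented for the example.

```python
from collections import deque


def build_graph(links):
    """links: iterable of (source_blog, target_blog); returns directed adjacency sets."""
    out = {}
    for src, dst in links:
        out.setdefault(src, set()).add(dst)
        out.setdefault(dst, set())
    return out


def pagerank(out, damping=0.85, iters=50):
    n = len(out)
    rank = {v: 1.0 / n for v in out}
    for _ in range(iters):
        # Rank held by blogs without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in out if not out[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in out}
        for v, targets in out.items():
            for t in targets:
                new[t] += damping * rank[v] / len(targets)
        rank = new
    return rank


def undirected(out):
    adj = {v: set() for v in out}
    for v, targets in out.items():
        for t in targets:
            if t != v:  # ignore self-links
                adj[v].add(t)
                adj[t].add(v)
    return adj


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted undirected graph."""
    score = {}
    for s in adj:
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {s: []}, []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], sigma[w], preds[w] = dist[v] + 1, 0.0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                edge = frozenset((v, w))
                score[edge] = score.get(edge, 0.0) + c
                delta[v] += c
    return score  # every pair is counted from both ends; the edge ranking is unaffected


def components(adj):
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def betweenness_clusters(adj, k):
    """Remove the highest-betweenness edge until at least k components exist."""
    adj = {v: set(ns) for v, ns in adj.items()}
    while len(components(adj)) < k:
        score = edge_betweenness(adj)
        if not score:
            break  # no edges left to remove
        a, b = max(score, key=score.get)
        adj[a].discard(b)
        adj[b].discard(a)
    return components(adj)


# Toy data: two tightly linked groups of blogs joined by a single link (c -> d).
links = [("a", "b"), ("b", "c"), ("c", "a"),
         ("d", "e"), ("e", "f"), ("f", "d"), ("c", "d")]
graph = build_graph(links)
ranks = pagerank(graph)
clusters = betweenness_clusters(undirected(graph), k=2)
```

On the toy data, the single link between the two groups carries the most shortest paths, so it is removed first and the two groups emerge as separate communities. Choosing the number of clusters `k` in advance is a simplification made here; other stopping criteria are possible.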