人工知能学会論文誌
Online ISSN : 1346-8030
Print ISSN : 1346-0714
ISSN-L : 1346-0714
原著論文
多言語Wikipediaエントリを知識源とする特定トピックの日英ブログサイト検索と日英対照ブログ分析
中崎 寛之川場 真理子横本 大輔宇津呂 武仁福原 知宏
著者情報
ジャーナル フリー

2010 年 25 巻 5 号 p. 613-622

詳細
抄録

The overall goal of this paper is to cross-lingually analyze multilingual blogs collected with a topic keyword. The framework of collecting multilingual blogs with a topic keyword is designed as the blog feed retrieval procedure. In this paper, we take an approach of collecting blog feeds rather than blog posts, mainly because we regard the former as a larger information unit in the blogosphere and prefer it as the information source for cross-lingual blog analysis. In the blog feed retrieval procedure, we also regard Wikipedia as a large scale ontological knowledge base for conceptually indexing the blogosphere. The underlying motivation of employing Wikipedia is in linking a knowledge base of well known facts and relatively neutral opinions with rather raw, user generated media like blogs, which include less well known facts and much more radical opinions. In our framework, first, in order to collect candidates of blog feeds for a given query, we use existing Web search engine APIs, which return a ranked list of blog posts, given a topic keyword. Next, we re-rank the list of blog feeds according to the number of hits of the topic keyword as well as closely related terms extracted from the Wikipedia entry in each blog feed. We compare the proposed blog feed retrieval method to existing Web search engine APIs and achieve significant improvement. We then apply the proposed blog distillation framework to the task of cross-lingually analyzing multilingual blogs collected with a topic keyword. Here, we cross-lingually and cross-culturally compare less well known facts and opinions that are closely related to a given topic. Results of cross-lingual blog analysis support the effectiveness of the proposed framework.

著者関連情報
© 2010 JSAI (The Japanese Society for Artificial Intelligence)
前の記事 次の記事
feedback
Top