Web Retrieval System Based on Lexical Paraphrasing Using Two Kinds of Co-occurrence Dictionaries

Tadahiko Kumamoto; Katsumi Tanaka

doi:10.1527/tjsai.23.355

Abstract

This paper proposes a Web retrieval system that accurately and exhaustively collects web pages related to a user-specified topic. When a user enters a character string as a query, the system lexically paraphrases and expands it, and can therefore present more topic-related web pages than conventional search engines do. First, the system extracts nouns, adjectives, verbs, and katakana characters as target words from the query, obtains candidate words for paraphrasing each target word through information retrieval on the Web, and tests the validity of each paraphrase using two kinds of co-occurrence dictionaries. Then, the system expands the initial query by replacing zero or more of the target words with the candidate words that were determined to be valid. A distinctive feature of the system is that the validity test uses not only a co-occurrence dictionary that describes "preceding," "following," and "predicate" relationships between words but also an impression dictionary that describes co-occurrence relationships between words and two contrasting sets of impression words. We evaluated the system's performance on paraphrasing and on Web information retrieval using seven sample queries, and the results showed that the system is effective.
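
The expansion step, in which zero or more target words are replaced with validated candidates, can be illustrated with a minimal Python sketch. It assumes the query is already tokenized and that the validity test has already produced a mapping from each target word to its accepted paraphrases. The function name `expand_query`, the data layout, and the sample words are hypothetical and are not taken from the paper, which may restrict or rank the combinations differently.

```python
from itertools import product


def expand_query(tokens, valid_candidates):
    """Enumerate every query obtained by replacing zero or more tokens.

    tokens: list of query words in order (hypothetical layout).
    valid_candidates: dict mapping a target word to the paraphrases that
        passed the validity test (assumed to be computed elsewhere).
    """
    # For each position, the choices are the original word followed by
    # its validated paraphrases; words without candidates stay fixed.
    options = [
        [tok] + [c for c in valid_candidates.get(tok, []) if c != tok]
        for tok in tokens
    ]
    # The Cartesian product covers all combinations; the first result
    # is the unmodified query (zero replacements).
    return [list(combo) for combo in product(*options)]


# Hypothetical example data, not from the paper.
queries = expand_query(
    ["cheap", "hotel", "Kyoto"],
    {"cheap": ["inexpensive"], "hotel": ["inn", "lodging"]},
)
for q in queries:
    print(" ".join(q))
```

The number of expanded queries grows as the product of the per-word choice counts, so a practical system would probably cap or prioritize the combinations it submits to the search engine.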
