Users express their information needs as search-engine queries to find relevant documents on the Internet. However, search queries are usually short, ambiguous, and/or underspecified, and users with a limited vocabulary have been found to struggle to formulate keyword-based queries. Query suggestion by mining query logs helps users formulate queries and has attracted attention in recent years. A query log is generally represented as a bipartite graph between a query set and a URL set. Most traditional approaches use the raw click frequency to weight the link between a query and a URL on the click graph. To alleviate the spurious effects of raw click frequency, entropy-biased models that combine raw click frequency with the inverse query frequency or inverse URL frequency have been proposed as weighting schemes for query representation. In this paper, we observe that popular queries and URLs are very diverse in nature, that user click frequency can be considered a local property of the URL, and that the link structures of queries and URLs on both sides of the bipartite graph can be considered a global property of the click graph. Based on this understanding, we develop a weighting scheme for the link between a query and a URL in the bipartite click graph that incorporates the user click frequency and the link structures of the query and the URL from both sides in a globally consistent manner. We conduct experiments on the AOL search engine query log dataset and evaluate the query suggestions by estimating the similarity between the user query and the suggested query using the Dmoz Open Directory Project. The results show that our global consistency scheme achieves better performance than the current entropy-biased model.
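
To make the baseline weighting concrete, the following is a minimal sketch of an entropy-biased click-graph weight of the kind the abstract compares against: raw click frequency multiplied by an inverse query frequency. It is not the global consistency scheme proposed in the paper, whose formula the abstract does not give. The function name `cf_iqf_weights`, the input layout, and the specific form iqf(u) = log(|Q| / n(u)), where n(u) is the number of distinct queries with a click on URL u, are assumptions made for illustration.

```python
import math
from collections import defaultdict


def cf_iqf_weights(clicks):
    """Weight each (query, url) edge by click frequency times inverse query frequency.

    clicks: dict mapping (query, url) -> raw click count (hypothetical input layout).
    Returns a dict mapping (query, url) -> weight. Assumed form, not the paper's scheme.
    """
    # Keep only edges that were actually clicked.
    edges = {edge: count for edge, count in clicks.items() if count > 0}

    # Collect the query side of the bipartite graph and, per URL, its linked queries.
    queries = set()
    queries_per_url = defaultdict(set)
    for (query, url) in edges:
        queries.add(query)
        queries_per_url[url].add(query)

    total_queries = len(queries)
    weights = {}
    for (query, url), count in edges.items():
        # URLs clicked from many distinct queries are less discriminative,
        # so their inverse query frequency is lower.
        iqf = math.log(total_queries / len(queries_per_url[url]))
        weights[(query, url)] = count * iqf
    return weights


# Toy click log with made-up queries and URLs.
toy_clicks = {
    ("java", "a.example"): 3,
    ("java", "b.example"): 1,
    ("python", "b.example"): 2,
}
toy_weights = cf_iqf_weights(toy_clicks)
```

In this toy log, `b.example` is clicked from every query, so its edges receive weight zero, while `a.example` is specific to one query and keeps a positive weight. This shows how an entropy-biased weight discounts popular URLs that raw click frequency alone would favor.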