2016 Volume 24 Issue 6 Pages 937-945
Given a large collection of time-evolving online user activities, such as Google Search queries for multiple keywords of various categories (celebrities, events, diseases, etc...), which consist of d keywords/activities, for l countries/locations of duration n, how can we find patterns and rules? For example, assume that we have the online search volume for “Harry Potter”, “Barack Obama” and “Amazon”, for 232 countries/territories, from 2004 to 2015, which include external shocks, sudden change of search volume, and more. How do we go about capturing non-linear evolutions of local activities and forecasting future patterns? Our goal is to analyze a large collection of time-evolving sequences, and moreover, to find the answer for the following issues: (a) Are there any important external shocks/events relating to the keywords in the sequences? (b) If there are, can we automatically detect them? (c) Are there any countries/territories which have different reacts to the global trend? In this paper, we present Δ-SPOT, a unifying analytical non-linear model for large scale web search data; as well as an efficient and effective fitting algorithm, which solves the above problems. Δ-SPOT can also forecast long-range future dynamics of the keywords/queries. Extensive experiments on real data show that our method outperforms other effective methods of non-linear mining in terms of accuracy in both fitting and forecasting.