Abstract:
|
Search data has become a rich resource for understanding user behavior and interests. Analyzing aggregated search behavior over time and space gives valuable insights into new trends and people's interests worldwide. We are interested not only in finding current "hot" topics, but also in understanding the dynamics of consumer behavior and interests over the short- and medium-term future. We use large-scale, semi-parametric probabilistic time series forecasts to leverage the "big data" of Google search time series. The forecasts are based on predictive state smoothing (PRESS), a novel kernel regression method for general-purpose estimation of predictive state representations, together with algorithms for large-scale, distributed computing environments that can learn from millions of time series at a time and produce real-time forecasts for new time series. Applications to Google search data illustrate the power and accuracy of the presented methods.
|