JSM 2011 Online Program

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Abstract Details

Activity Number: 348
Type: Contributed
Date/Time: Tuesday, August 2, 2011 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Computing
Abstract - #300570
Title: Large-scale Parallel Statistical Forecasting Computations in R
Author(s): Murray Stokely and Farzan Rohani*+ and Nate Coehlo and Eric Tassone
Companies: Google Inc. and Google Inc. and Google Inc. and Google Inc.
Keywords: R ; distributed computing ; forecasting ; simulation ; statistical computing ; time series

We demonstrate the utility of massively parallel computational infrastructure for statistical computing using the MapReduce paradigm for R. This framework allows users to write computations in a high-level language that are then broken up and distributed to worker tasks in Google datacenters. Results are collected in a scalable, distributed data store and returned to the interactive user session. We apply our approach to a forecasting application that fits a variety of models, prohibiting an analytical description of the statistical uncertainty associated with the overall forecast. To overcome this, we generate simulation-based uncertainty bands, which necessitates a large number of computationally intensive realizations. Our technique cut total run time by a factor of 300. Distributing the computation across many machines permits analysts to focus on statistical issues while answering
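The simulation-based uncertainty bands described in the abstract can be illustrated with a minimal map/reduce-style sketch. This is not the authors' R framework or Google's infrastructure. It is a Python stand-in that makes several assumptions: a toy random-walk forecast model, a local thread pool in place of distributed worker tasks, and hypothetical function names (`simulate_paths`, `uncertainty_bands`, `forecast_bands`). The map step simulates independent batches of realizations, and the reduce step pools them and takes per-step quantiles.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_paths(task):
    """Map step: simulate one batch of forecast realizations from one seed."""
    seed, n_paths, horizon, last_value, sigma = task
    rng = random.Random(seed)  # independent, reproducible stream per task
    paths = []
    for _ in range(n_paths):
        value, path = last_value, []
        for _ in range(horizon):
            # Toy random-walk model (an assumption, not the authors' models).
            value += rng.gauss(0.0, sigma)
            path.append(value)
        paths.append(path)
    return paths


def quantile(sorted_values, q):
    """Nearest-rank style quantile of an already sorted list."""
    last = len(sorted_values) - 1
    index = min(last, max(0, round(q * last)))
    return sorted_values[index]


def uncertainty_bands(all_batches, lower=0.05, upper=0.95):
    """Reduce step: pool every realization and take per-step quantiles."""
    paths = [path for batch in all_batches for path in batch]
    horizon = len(paths[0])
    bands = []
    for step in range(horizon):
        values = sorted(path[step] for path in paths)
        bands.append((quantile(values, lower), quantile(values, upper)))
    return bands


def forecast_bands(n_tasks=8, n_paths=250, horizon=12,
                   last_value=100.0, sigma=1.0):
    """Fan the simulation batches out to workers, then combine the results."""
    tasks = [(seed, n_paths, horizon, last_value, sigma)
             for seed in range(n_tasks)]
    # A thread pool stands in for remote worker tasks in this sketch.
    with ThreadPoolExecutor(max_workers=4) as pool:
        batches = list(pool.map(simulate_paths, tasks))
    return uncertainty_bands(batches)
```

Calling `forecast_bands()` returns one `(lower, upper)` pair per forecast step. Because each batch depends only on its own seed, the batches can run in any order or on any worker, which is the property that lets this kind of computation be distributed.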

The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.

