The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Abstract Details
Activity Number: 348
Type: Contributed
Date/Time: Tuesday, August 2, 2011 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Computing
Abstract - #300570
Title: Large-scale Parallel Statistical Forecasting Computations in R
Author(s): Murray Stokely and Farzan Rohani*+ and Nate Coehlo and Eric Tassone
Companies: Google Inc. and Google Inc. and Google Inc. and Google Inc.
Keywords: R ; distributed computing ; forecasting ; simulation ; statistical computing ; timeseries
Abstract:
We demonstrate the utility of massively parallel computational infrastructure for statistical computing using the MapReduce paradigm for R. This framework allows users to write computations in a high-level language that are then broken up and distributed to worker tasks in Google datacenters. Results are collected in a scalable, distributed data store and returned to the interactive user session. We apply our approach to a forecasting application that fits a variety of models, prohibiting an analytical description of the statistical uncertainty associated with the overall forecast. To overcome this, we generate simulation-based uncertainty bands, which necessitates a large number of computationally intensive realizations. Our technique cut total run time by a factor of 300. Distributing the computation across many machines permits analysts to focus on statistical issues while answering
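The map-and-collect pattern described in the abstract can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not the authors' R framework, which is not shown on this page. The toy random-walk model, the function names (`simulate_path`, `uncertainty_bands`, `run`), and all parameter values are assumptions made purely for illustration. Each worker produces one simulated realization of a forecast (the map step), and the collected realizations are reduced to per-horizon empirical quantile bands.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def simulate_path(seed, horizon=12, start=100.0, drift=0.5, noise_sd=2.0):
    """Map step: one realization of a toy random-walk forecast (assumed model)."""
    rng = random.Random(seed)  # per-task seed keeps each realization reproducible
    level = start
    path = []
    for _ in range(horizon):
        level += drift + rng.gauss(0.0, noise_sd)
        path.append(level)
    return path


def uncertainty_bands(paths, lower=0.05, upper=0.95):
    """Reduce step: empirical (lower, median, upper) values at each horizon."""
    bands = []
    for values in zip(*paths):  # all realizations at one forecast horizon
        ordered = sorted(values)
        lo = ordered[int(lower * (len(ordered) - 1))]
        hi = ordered[int(upper * (len(ordered) - 1))]
        bands.append((lo, statistics.median(ordered), hi))
    return bands


def run(n_realizations=1000, workers=8):
    """Distribute realizations over a local worker pool, then summarize them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        paths = list(pool.map(simulate_path, range(n_realizations)))
    return uncertainty_bands(paths)
```

A local thread pool only stands in for the worker tasks here; in CPython it will not speed up CPU-bound simulation, so a process pool or a cluster scheduler would take its place in practice. The structure stays the same: independent realizations are mapped across workers and the results are gathered before the bands are computed.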
The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.