Abstract Details
Activity Number: 210
Type: Invited
Date/Time: Monday, August 5, 2013, 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Education
Abstract - #307455
Title: Statistical Inference at Google Scale
Author(s): Nicholas Chamandy*+
Companies: Google
Keywords: massive data ; data streams ; google ; model feedback ; heavy tails
Abstract:
Google generates and consumes data on a staggering scale. Massive, distributed data present novel and interesting challenges for the statistician. Statistical models we take for granted are sometimes out of reach, constructing a matrix of dimensions n by p can be pure fantasy, and outliers are the rule, not the exception. Powerful computing tools help remedy our processing woes, but tend to aggregate data into rigid categories of unsophisticated summary statistics. On the other side of the coin, massive data present great opportunities for a statistician. Estimating tiny experimental effect sizes becomes routine, and practical significance is often more elusive than mere statistical significance. Moreover, the power of approaches that pool data across observations can be fully realized. Through examples, I will provide a framing for this New Frontier in statistical thinking, and highlight interesting tradeoffs that an applied statistician is now forced to make. These examples include variance estimation at scale, feedback in system-deployed statistical models, and working with high-dimensional "heavy-tailed" factors.
Authors who are presenting talks have a * after their name.
2013 JSM Online Program Home
The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Copyright © American Statistical Association.