Abstract Details
Activity Number: 499
Type: Contributed
Date/Time: Wednesday, August 7, 2013 : 8:30 AM to 10:20 AM
Sponsor: Survey Research Methods Section
Abstract - #307813
Title: Analysis of Large Survey Data Sets Using Dynamically Generated SQL
Author(s): Thomas Lumley*+
Companies: University of Auckland
Keywords: complex sampling; American Community Survey; relational database; big data; statistical computing; statistical graphics
Abstract:
Most complex surveys can be analysed efficiently by software that loads all the data into memory, but for a small number of large surveys, loading all the data into memory is inefficient or infeasible on typical laptop or desktop computers. For example, the American Community Survey (ACS) includes 3,000,000 people per year and 80 sets of replicate weights, and the Nationwide Emergency Department Sample (NEDS) includes more than 25,000,000 hospital visit records per year. I will describe a computational architecture where computations involving arithmetic on full-size vectors are performed by the MonetDB database engine, controlled by dynamically generated SQL code. For most computations, only summary statistics need to be transferred to R for computations not available in SQL; in a few cases the whole data set is needed, but it can be transferred in small chunks. The system supports means, totals, medians, linear and loglinear models, smoothers and scatterplots, and is fast enough to allow interactive analysis of ACS datasets on a modern laptop.
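The abstract does not show the generated SQL, so the following is only a minimal sketch of the general idea, not the author's implementation. It uses Python and the standard-library sqlite3 module as stand-ins for R and MonetDB so that it is self-contained. The table name `survey`, the column names, the helper functions, and the replicate-variance scale factor are all hypothetical. The first part builds one SELECT statement that computes a weighted mean under the full-sample weight and under each replicate weight, so that only a single row of summaries leaves the database. The second part illustrates pulling rows in small chunks for the cases where the whole data set is needed.

```python
import sqlite3


def build_mean_sql(table, variable, weight, rep_weights):
    """Generate one SELECT returning the weighted mean under every weight column.

    Identifiers are interpolated directly into the SQL text, so they must come
    from trusted code, never from user input.
    """
    weight_cols = [weight] + list(rep_weights)
    exprs = [
        f"SUM({w} * {variable}) * 1.0 / SUM({w}) AS m_{i}"
        for i, w in enumerate(weight_cols)
    ]
    return f"SELECT {', '.join(exprs)} FROM {table}"


def weighted_mean(conn, table, variable, weight, rep_weights, scale):
    """Return (estimate, standard error); only one summary row is transferred.

    `scale` is the replicate-variance multiplier; its value depends on the
    survey's replication design and is an assumption supplied by the caller.
    """
    row = conn.execute(build_mean_sql(table, variable, weight, rep_weights)).fetchone()
    estimate, replicates = row[0], row[1:]
    variance = scale * sum((r - estimate) ** 2 for r in replicates)
    return estimate, variance ** 0.5


def iter_chunks(conn, sql, chunk_size):
    """Yield query results in lists of at most `chunk_size` rows."""
    cursor = conn.execute(sql)
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield rows


# Toy data: three records, one full-sample weight and two replicate weights.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (income REAL, w REAL, w1 REAL, w2 REAL)")
conn.executemany(
    "INSERT INTO survey VALUES (?, ?, ?, ?)",
    [(10.0, 1.0, 2.0, 0.0), (20.0, 1.0, 0.0, 2.0), (30.0, 2.0, 2.0, 2.0)],
)

# Summary-statistics path: the arithmetic over full columns runs in the database.
est, se = weighted_mean(conn, "survey", "income", "w", ["w1", "w2"], scale=1.0)

# Chunked path: accumulate a weighted total two rows at a time.
chunk_total = 0.0
for rows in iter_chunks(conn, "SELECT income, w FROM survey", chunk_size=2):
    chunk_total += sum(income * w for income, w in rows)
```

With 80 replicate weights, as in the ACS, the same generator would emit 81 aggregate expressions in one statement, and the result would still be a single row. Whether the actual system batches replicates in this way is not stated in the abstract.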
Authors who are presenting talks have a * after their name.
2013 JSM Online Program Home
The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Copyright © American Statistical Association.