JSM 2013 Home

Abstract Details

Activity Number: 499
Type: Contributed
Date/Time: Wednesday, August 7, 2013 : 8:30 AM to 10:20 AM
Sponsor: Survey Research Methods Section
Abstract - #307813
Title: Analysis of Large Survey Data Sets Using Dynamically Generated SQL
Author(s): Thomas Lumley*+
Companies: University of Auckland
Keywords: complex sampling ; American Community Survey ; relational database ; big data ; statistical computing ; statistical graphics
Abstract:

Most complex surveys can be analysed efficiently by software that loads all the data into memory, but there are a small number of large surveys where loading all the data into memory is inefficient or infeasible on typical laptop or desktop computers. For example, the American Community Survey (ACS) includes 3,000,000 people per year and 80 sets of replicate weights, and the Nationwide Emergency Department Sample (NEDS) includes more than 25,000,000 hospital visit records per year. I will describe a computational architecture in which computations involving arithmetic on full-size vectors are performed by the MonetDB database engine, controlled by dynamically generated SQL code. For most computations, only summary statistics need to be transferred to R for the computations not available in SQL; in a few cases the whole data set is needed, but it can be transferred in small chunks. The system supports means, totals, medians, linear and loglinear models, smoothers and scatterplots, and is fast enough to allow interactive analysis of ACS datasets on a modern laptop.
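
The general idea of pushing full-size vector arithmetic into the database and pulling back only summaries can be sketched as follows. This is an illustration only, not the author's implementation: it uses Python with the standard-library sqlite3 module as a stand-in for R and MonetDB, and the table name `survey`, the column names `income`, `wt`, `rep1`, `rep2`, the helper functions, and the `scale` factor are all hypothetical. The correct scale for replicate-weight variances depends on the survey's replication scheme.

```python
import sqlite3


def weighted_mean_sql(table, variable, weight_cols):
    """Build one SELECT returning a weighted mean for each weight column.

    Identifiers are interpolated into the SQL text, so they are checked
    first; only trusted column names should ever be passed in.
    """
    for name in [table, variable, *weight_cols]:
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    terms = [
        f"SUM({w} * {variable}) * 1.0 / SUM({w}) AS mean_{i}"
        for i, w in enumerate(weight_cols)
    ]
    return f"SELECT {', '.join(terms)} FROM {table}"


def iter_chunks(conn, sql, chunk_size):
    """Yield query results in lists of at most chunk_size rows."""
    cur = conn.execute(sql)
    while True:
        rows = cur.fetchmany(chunk_size)
        if not rows:
            break
        yield rows


# Toy data standing in for a large survey table (hypothetical values).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (income REAL, wt REAL, rep1 REAL, rep2 REAL)")
conn.executemany(
    "INSERT INTO survey VALUES (?, ?, ?, ?)",
    [(10.0, 1.0, 2.0, 0.0), (20.0, 1.0, 0.0, 2.0), (30.0, 2.0, 2.0, 2.0)],
)

# The database does the arithmetic over all rows; a single row of
# summary statistics comes back to the client.
sql = weighted_mean_sql("survey", "income", ["wt", "rep1", "rep2"])
estimate, *replicates = conn.execute(sql).fetchone()

# Replicate-weight variance computed client-side from the summaries.
# scale = 1.0 is an assumption made for this toy example.
scale = 1.0
variance = scale * sum((r - estimate) ** 2 for r in replicates)

# When the full data are needed, they can be streamed in small chunks.
chunk_sizes = [
    len(rows) for rows in iter_chunks(conn, "SELECT income, wt FROM survey", 2)
]
```

Generating one query that covers every weight column means the table is scanned once, and the amount of data transferred grows with the number of replicate weights rather than with the number of survey records.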


Authors who are presenting talks have a * after their name.

