This is the program for the 2010 Joint Statistical Meetings in Vancouver, British Columbia.
Abstract Details
Activity Number: 23
Type: Topic Contributed
Date/Time: Sunday, August 1, 2010 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Graphics
Abstract - #309465
Title: Airline On-Time Data Set
Author(s): Michael Kane
Companies: Yale University
Address:
Keywords:
Abstract:
The 2009 Data Expo centered on the airline on-time performance data set. The data set includes information on over 120 million flights, each with 29 variables related to flight time, delay time, and so on. Uncompressed, it is about 12 gigabytes in size. Exploratory data analysis on a data set this large presents a significant challenge to R users who rely on familiar data structures and statistical functions for their analyses. This presentation shows an animated visualization of the data set and proposes answers to the questions "What is the best time to fly?" and "Do older planes suffer more delays?" using the high-performance computing packages bigmemory, foreach, multicore, doMC, and biglm. The goal is to elucidate the usage of these packages in a big-data setting and show how easily they can be incorporated into an R user's repertoire.
The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.