This is the program for the 2010 Joint Statistical Meetings in Vancouver, British Columbia.

Abstract Details

Activity Number: 23
Type: Topic Contributed
Date/Time: Sunday, August 1, 2010, 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Graphics
Abstract - #309465
Title: Airline On-Time Data Set
Author(s): Michael Kane
Companies: Yale University
Abstract:

The 2009 Data Expo centered on the airline on-time performance data set. The data set covers more than 120 million flights, each described by 29 variables such as flight time and delay time, and occupies about 12 gigabytes uncompressed. Exploratory data analysis at this scale presents a significant challenge to R users who rely on familiar data structures and statistical functions. This presentation shows an animated visualization of the data set and proposes answers to the questions "What is the best time to fly?" and "Do older planes suffer more delays?" using the high-performance computing packages bigmemory, foreach, multicore, doMC, and biglm. The goal is to elucidate the use of these packages in a big-data setting and to show how easily they can be incorporated into an R user's repertoire.
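
As a rough illustration of the kind of question the abstract raises ("What is the best time to fly?"), the sketch below streams delay records one row at a time and keeps only a running sum and count per scheduled departure hour, so the full table never has to sit in memory. It is written in Python for illustration only and is not the presenter's method, nor does it use the R packages named above. The column names CRSDepTime and DepDelay, the "NA" missing-value marker, and the sample rows are all assumptions made for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows for illustration; these are not real flight records.
# Assumed columns: CRSDepTime (scheduled departure, hhmm) and DepDelay (minutes).
SAMPLE_CSV = """CRSDepTime,DepDelay
0605,2
0650,-3
1710,25
1745,40
1730,NA
2210,10
"""


def mean_delay_by_hour(lines):
    """Return the mean departure delay per scheduled hour, reading one row at a time."""
    sums = defaultdict(float)   # running total of delay minutes per hour
    counts = defaultdict(int)   # number of usable records per hour
    for row in csv.DictReader(lines):
        if row["DepDelay"] == "NA":   # skip records with a missing delay
            continue
        hour = int(row["CRSDepTime"]) // 100   # hhmm -> hour of day
        sums[hour] += float(row["DepDelay"])
        counts[hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sorted(counts)}


result = mean_delay_by_hour(io.StringIO(SAMPLE_CSV))
best_hour = min(result, key=result.get)   # hour with the lowest mean delay
print(result, best_hour)
```

Because only per-hour totals are retained, the same loop could in principle be pointed at an open file handle for a much larger CSV file, and per-file partial totals could be summed afterwards; whether that is fast enough at the scale described in the abstract is not something this sketch establishes.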


