JSM 2013 Home

Abstract Details

Activity Number: 560
Type: Roundtables
Date/Time: Wednesday, August 7, 2013, 12:30 PM to 1:50 PM
Sponsor: Section on Statistical Computing
Abstract - #308406
Title: Divide and Recombine: Statistical Theory, Methods, and Visualization for Large Complex Data
Author(s): Bowei Xi*+
Companies: Purdue University
Keywords: large complex data; parallel computing; divide and recombine; Hadoop
Abstract:

Large complex data challenge our statistical theory, statistical and machine-learning methods, visualization methods, statistical models, computational algorithms, and computational environments. There are two goals: to maintain comprehensive, detailed analysis that preserves the information in the data, and to maintain the ability to analyze the data wholly from within an interactive language such as R. Divide and recombine (D&R) is an approach that seeks to achieve these goals. The data are divided into subsets, usually non-overlapping; statistical procedures are applied to each subset; and the outputs from the subsets are recombined. D&R estimators have different theoretical properties from the direct whole-data estimators and are typically less efficient. However, if the division is carried out using statistical thinking (e.g., stratified sampling) to make each subset as representative as possible, then results can be excellent. Because each subset is processed independently, this approach leads to the simplest possible parallel computation, allowing a merger of an interactive language with a distributed, parallel compute engine and distributed database such as Hadoop. D&R thus turns the problem of analyzing large complex data into a statistical problem.
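
The divide, apply, and recombine steps described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes a simple linear model, a random division into non-overlapping subsets (a stand-in for the more careful, representative division the abstract recommends), and recombination by averaging the per-subset coefficients. The function names (fit_line, divide, recombine, divide_and_recombine) and the synthetic data are hypothetical and chosen only for illustration.

```python
import random
from statistics import fmean


def fit_line(points):
    """Ordinary least squares for y = a + b*x on a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def divide(points, n_subsets, seed=0):
    """Shuffle the rows, then deal them into non-overlapping subsets.

    Assumption: random division stands in for a representative division.
    """
    shuffled = points[:]
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def recombine(estimates):
    """Average the per-subset (intercept, slope) pairs coordinate by coordinate.

    Assumption: plain averaging is one possible recombination rule.
    """
    return tuple(fmean(column) for column in zip(*estimates))


def divide_and_recombine(points, n_subsets, seed=0):
    subsets = divide(points, n_subsets, seed)
    # Each fit uses only its own subset, so these calls are independent
    # and could be dispatched to separate workers.
    estimates = [fit_line(subset) for subset in subsets]
    return recombine(estimates)


# Synthetic data from y = 2 + 0.5*x + noise (illustrative only).
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(10_000)]
data = [(x, 2.0 + 0.5 * x + rng.gauss(0, 1)) for x in xs]

whole_intercept, whole_slope = fit_line(data)
dr_intercept, dr_slope = divide_and_recombine(data, n_subsets=20)
print("whole-data estimate:", whole_intercept, whole_slope)
print("D&R estimate:       ", dr_intercept, dr_slope)
```

In this sketch the D&R estimate generally differs slightly from the whole-data estimate, which mirrors the abstract's point that the two estimators have different properties. The per-subset fits share no state, which is what makes the parallel computation simple.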


Authors who are presenting talks have a * after their name.



The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
