JSM 2012 Online Program

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Abstract Details

Activity Number: 174
Type: Contributed
Date/Time: Monday, July 30, 2012 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Computing
Abstract - #306355
Title: Computation of Order Statistics for Very Large and Distributed Data
Author(s): Damir Spisic*+ and Graham J. Wills and Fan Li
Companies: IBM SPSS Predictive Analytics and IBM SPSS Predictive Analytics and IBM SPSS Predictive Analytics
Address: 233 S. Wacker Dr. 11th Fl., Chicago, IL, 60606, United States
Keywords: order statistics ; distributed data ; robust data preparation

Computing order statistics, along with other descriptive summaries, is critical for robust data assessment and for preparing data for further analysis. Compared with moment-based statistics, order statistics are challenging to compute for large and distributed data sources and for streaming data. Available methods produce approximate order statistics with deterministic bounds in a single data pass. However, these methods bound the rank of an order statistic and do not limit the amount of memory used in the computation. Robust data preparation often depends on accurate order statistic values (i.e., their location) rather than their rank. We propose an algorithm that approximates order statistics and estimates their location bounds in a single data pass. It limits memory usage while providing a trade-off between accuracy and the use of computational resources. This is achieved through dynamic data binning that is both scalable and effective in providing tight deterministic bounds for streaming data. Bins from multiple data sources are subsequently combined to estimate the overall order statistics and their deterministic location bounds.
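The abstract does not give the algorithm itself, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a simple scheme: each source keeps at most `max_bins` disjoint bins of the form `[lo, hi, count]`, merges the adjacent pair with the narrowest combined span when the limit is exceeded, and bins from several sources are pooled to bound the location of the k-th smallest value. The names `BinSketch`, `max_bins`, and `location_bounds` are hypothetical.

```python
import bisect


class BinSketch:
    """Bounded-memory, single-pass summary of a stream (illustrative only).

    Bins are [lo, hi, count] lists, sorted by lo and never overlapping;
    lo and hi are the smallest and largest values seen in that bin.
    """

    def __init__(self, max_bins=64):
        self.max_bins = max_bins  # assumed to be at least 1
        self.bins = []

    def add(self, x):
        los = [b[0] for b in self.bins]
        i = bisect.bisect_right(los, x) - 1  # last bin with lo <= x
        if i >= 0 and x <= self.bins[i][1]:
            self.bins[i][2] += 1  # x falls inside an existing bin
            return
        self.bins.insert(i + 1, [x, x, 1])  # new singleton bin
        if len(self.bins) > self.max_bins:
            self._merge_narrowest_pair()

    def _merge_narrowest_pair(self):
        # Merge the adjacent pair whose combined span is smallest.
        j = min(range(len(self.bins) - 1),
                key=lambda i: self.bins[i + 1][1] - self.bins[i][0])
        a, b = self.bins[j], self.bins[j + 1]
        self.bins[j:j + 2] = [[a[0], b[1], a[2] + b[2]]]


def location_bounds(bins, k):
    """Deterministic (lower, upper) bounds on the k-th smallest value (1-based).

    Works for pooled bins from several sources, which may overlap.
    """
    def first_reaching(ordered, idx):
        total = 0
        for b in ordered:
            total += b[2]
            if total >= k:
                return b[idx]
        raise ValueError("k exceeds the number of observations")

    # Fewer than k values can lie below `lower`; at least k lie at or below `upper`.
    lower = first_reaching(sorted(bins, key=lambda b: b[0]), 0)
    upper = first_reaching(sorted(bins, key=lambda b: b[1]), 1)
    return lower, upper


# Usage: two sources summarized independently, then pooled for a median bound.
data = [(i * 37) % 101 for i in range(200)]
source_a, source_b = BinSketch(max_bins=8), BinSketch(max_bins=8)
for idx, value in enumerate(data):
    (source_a if idx % 2 == 0 else source_b).add(value)
pooled = source_a.bins + source_b.bins
median_lower, median_upper = location_bounds(pooled, len(data) // 2)
```

In this sketch, a larger `max_bins` gives narrower bins and therefore tighter location bounds at the cost of more memory, which mirrors the accuracy and resource trade-off described above. The merge rule used here is one plausible choice among many.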

The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.

