Abstract Details

Activity Number: 185 - Contributed Poster Presentations: International Statistical Institute
Type: Contributed
Date/Time: Monday, July 30, 2018 : 10:30 AM to 12:20 PM
Sponsor: International Statistical Institute
Abstract #329747
Title: A-Optimal Subsampling for Big Data Generalized Estimating Equations
Author(s): Thomas Cheung*
Companies: Purdue University - Indianapolis
Keywords: A-optimality; big data; leverage scores; least squares; random sampling; weighted bootstrap estimator
Abstract:

In this poster, we work within the generalized bootstrap framework of Chatterjee and Bose (2005) for estimators obtained by solving estimating equations. We consider the case in which the sample size n is extremely large, so that the full-sample estimate is either unavailable or too time-consuming to compute. Typically, the dimension p is also very large. Our approach to this big data estimation problem is A-optimal subsampling: we seek the A-optimal sampling distribution on the data points and use it to draw a subsample of size r as a surrogate for the whole sample. We approximate the full-sample estimate by the subsampling generalized bootstrap estimate, which solves the corresponding estimating equations. We show that the A-optimal weights are more effective at capturing important information than both the generalized bootstrap weights suggested by Chatterjee and Bose and the leverage scores, a frequently used non-uniform sampling distribution. This is demonstrated by simulations of a Cox proportional hazards regression model, in which A-optimal subsampling gives the minimum mean squared error of the estimate.
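
As a rough illustration of the general idea of subsampling with a non-uniform distribution and then solving a weighted estimating equation, the following minimal sketch uses ordinary least squares with leverage-score sampling, which the abstract names as a common baseline. It is not the A-optimal distribution or the Cox model setting of the poster, neither of which is specified here. The function names, the simulated data, and the inverse-probability weights 1/(r * pi_i) are assumptions made for illustration only.

```python
import numpy as np


def leverage_probabilities(X):
    """Sampling probabilities proportional to the leverage scores of X."""
    # With a thin QR factorization X = QR, the i-th leverage score is the
    # squared Euclidean norm of the i-th row of Q.
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q ** 2, axis=1)
    return h / h.sum()


def subsample_least_squares(X, y, probs, r, rng):
    """Draw r rows with replacement and solve the weighted normal equations."""
    n = X.shape[0]
    idx = rng.choice(n, size=r, replace=True, p=probs)
    # Inverse-probability weights (an assumption of this sketch).
    w = 1.0 / (r * probs[idx])
    sw = np.sqrt(w)
    # Solving least squares on sqrt(w)-scaled rows is equivalent to solving
    # sum_i w_i * x_i * (y_i - x_i' b) = 0 over the subsample.
    beta, *_ = np.linalg.lstsq(X[idx] * sw[:, None], y[idx] * sw, rcond=None)
    return beta


# Simulated data (hypothetical sizes, chosen only to keep the example fast).
rng = np.random.default_rng(0)
n, p, r = 20000, 5, 500
X = rng.standard_t(df=3, size=(n, p))          # heavy-tailed covariates
beta_true = np.arange(1, p + 1, dtype=float)
y = X @ beta_true + rng.normal(size=n)

beta_full = np.linalg.lstsq(X, y, rcond=None)[0]   # full-sample estimate
probs_lev = leverage_probabilities(X)
beta_sub = subsample_least_squares(X, y, probs_lev, r, rng)

print("distance from full-sample estimate:", np.linalg.norm(beta_sub - beta_full))
```

Replacing probs_lev with a different probability vector (for example, the uniform distribution np.full(n, 1 / n)) allows a comparison of sampling distributions under the same estimator, which is the kind of comparison the abstract reports.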


Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program