All Times EDT

Abstract Details

Activity Number: 15 - Subsampling: Basic Tool That Facilitates the Identification of Statistical Relationships in Big Data
Type: Topic Contributed
Date/Time: Sunday, August 7, 2022 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #322539
Title: Unweighted Estimation Based on Optimal Sample Under Measurement Constraints
Author(s): Jing Wang*
Companies: University of Connecticut
Keywords: Massive data; Generalized linear models; Martingale central limit theorem
Abstract:

Subsampling is a practical way to handle massive data by selecting the more informative data points. When responses are expensive to measure, however, developing efficient subsampling schemes is challenging, and the optimal sampling approach under measurement constraints was developed to meet this challenge. That method reweights the objective function by the inverses of the optimal sampling probabilities, which assigns smaller weights to the more informative data points, so the estimation efficiency of the resulting estimator leaves room for improvement. In this paper, we propose an unweighted estimating procedure based on optimal subsamples to obtain a more efficient estimator. We derive the unconditional asymptotic distribution of the estimator via martingale techniques, without conditioning on the pilot estimate, a setting that has received little attention in the existing subsampling literature. Both asymptotic and numerical results show that the unweighted estimator is more efficient for parameter estimation.
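
The contrast between the weighted and unweighted estimators described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a logistic regression model as the generalized linear model, and it uses sampling probabilities proportional to the covariate norm as a simple stand-in for the optimal probabilities, which the abstract does not specify. The function name `fit_logistic`, the Poisson sampling step, and all sizes and parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def fit_logistic(X, y, weights=None, n_iter=50):
    """Newton's method for an (optionally weighted) logistic log-likelihood."""
    n, d = X.shape
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    beta = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (w * (y - p))                      # weighted score
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

rng = np.random.default_rng(0)
N, d, r = 100_000, 3, 2_000            # full data size, dimension, expected subsample size
beta_true = np.array([0.5, -0.5, 0.5])  # hypothetical true parameter
X = rng.normal(size=(N, d))

# Assumption: inclusion probabilities depend on covariates only, because
# responses are unmeasured before sampling. Proportional-to-norm is a
# stand-in, not the optimal probabilities from the paper.
norms = np.linalg.norm(X, axis=1)
pi = np.minimum(1.0, r * norms / norms.sum())

# Poisson sampling: each point is included independently with probability pi[i].
selected = rng.random(N) < pi
X_sub = X[selected]

# Responses are "measured" only for the selected points.
p_sub = 1.0 / (1.0 + np.exp(-(X_sub @ beta_true)))
y_sub = rng.binomial(1, p_sub)

# Weighted estimator: inverse-probability weights in the objective function.
beta_weighted = fit_logistic(X_sub, y_sub, weights=1.0 / pi[selected])
# Unweighted estimator: the same subsample, with no reweighting.
beta_unweighted = fit_logistic(X_sub, y_sub)

print("weighted:  ", beta_weighted)
print("unweighted:", beta_unweighted)
```

A single run like this only shows that both estimators can be computed from the same subsample. Comparing their efficiency would require repeating the simulation many times and comparing, for example, the empirical mean squared errors of the two estimates.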


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program