All Times EDT

Abstract Details

Activity Number: 98 - Student Paper Award and John M. Chambers Statistical Software Award
Type: Topic Contributed
Date/Time: Monday, August 8, 2022 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Graphics
Abstract #320932
Title: Core-Elements for Least Squares Estimation
Author(s): Mengyu Li* and Cheng Meng
Companies: Renmin University of China and Renmin University of China
Keywords: Element-wise subsampling; Linear regression; Least squares; Sparse matrix; Subset selection
Abstract:

The coresets approach, also called subsampling or subset selection, selects a subsample to serve as a surrogate for the observed sample. It is widely used in large-scale data analysis. Existing coresets methods construct the subsample from a subset of rows of the predictor matrix, which can be highly inefficient when the predictor matrix is sparse or numerically sparse. To overcome this limitation, we develop a novel element-wise subset selection approach, called core-elements. We provide a deterministic algorithm to construct the core-elements-based estimator at a computational cost of only O(nnz(X) + p^3), where X is the predictor matrix with p covariates and nnz(X) denotes its number of non-zero elements. Theoretically, we show that the proposed estimator is unbiased and that it approximately minimizes an upper bound of the estimation variance. We also provide a coresets-like finite-sample bound for the proposed estimator with approximation guarantees. Numerical studies on various synthetic and real-world datasets demonstrate the proposed method's superior performance compared to mainstream competitors.
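
To make the idea of element-wise selection concrete, the following is a minimal illustrative sketch, not the authors' algorithm. The abstract does not state the selection rule or the form of the estimator, so two assumptions are made here: the retained elements are the r largest-magnitude entries in each column of X (giving a sparse matrix called X_star below), and the estimator is (X_star^T X)^{-1} X_star^T y, which is one form that stays unbiased under the linear model y = X beta + noise. The function names and the parameter r are hypothetical.

```python
import numpy as np

def keep_largest_per_column(X, r):
    """Zero out all but the r largest-magnitude entries in each column.

    Assumption: this selection rule is illustrative; the abstract does not
    specify how core-elements are chosen.
    """
    n, p = X.shape
    X_star = np.zeros_like(X)
    # Row indices of the r largest |x_ij| in each column, shape (r, p).
    idx = np.argpartition(np.abs(X), n - r, axis=0)[n - r:, :]
    cols = np.arange(p)
    X_star[idx, cols] = X[idx, cols]
    return X_star

def element_subset_estimator(X, y, r):
    """Assumed estimator: solve (X_star^T X) b = X_star^T y for b."""
    X_star = keep_largest_per_column(X, r)
    return np.linalg.solve(X_star.T @ X, X_star.T @ y)

# Small demonstration on a synthetic sparse design (all values hypothetical).
rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p)) * (rng.random((n, p)) < 0.3)  # about 70% zeros
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + 0.1 * rng.normal(size=n)

print(element_subset_estimator(X, y, r=20))
```

With noiseless responses (y = X beta exactly), this assumed estimator recovers beta exactly whenever X_star^T X is invertible, which is the algebraic reason it is unbiased under zero-mean noise. A dense array is used here for brevity; reaching a cost on the order of nnz(X) would require a sparse matrix representation.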


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program