
Abstract Details

Activity Number: 181 - SPEED: Statistical Learning and Data Science Speed Session 1, Part 2
Type: Contributed
Date/Time: Monday, July 29, 2019, 10:30 AM to 11:15 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #307533
Title: Quantile Regression Under Memory Constraint
Author(s): Yichen Zhang* and Xi Chen and Weidong Liu
Companies: New York University and New York University and Shanghai Jiaotong University
Keywords: quantile regression; divide-and-conquer; distributed inference; Bahadur representation; online streaming data; asymptotic theory
Abstract:

We study the inference problem in quantile regression (QR) for a large sample size n under a limited memory constraint, where the memory can only store a small batch of data of size m. A natural method is the naïve divide-and-conquer approach, which splits the data into batches of size m, computes the local QR estimator for each batch, and then aggregates the estimators by averaging. However, this method only works when n=o(m^2) and is computationally expensive. We propose a computationally efficient method, which only requires an initial QR estimator on a small batch of data and then successively refines the estimator via multiple rounds of aggregation. Theoretically, as long as n grows polynomially in m, we establish the asymptotic normality of the obtained estimator and show that our estimator, with only a few rounds of aggregation, achieves the same efficiency as the QR estimator computed on all the data. Moreover, our result allows the dimensionality p to diverge to infinity. The proposed method can also be applied to the QR problem in a distributed computing environment (e.g., a large-scale sensor network) or for real-time streaming data.
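The naïve divide-and-conquer baseline described in the abstract can be sketched in a few lines. The sketch below is illustrative only (the function names and simulation setup are my own, not from the paper): each local QR estimator is obtained by solving the standard linear-programming formulation of the check-loss minimization, and the batch estimators are then averaged.

```python
import numpy as np
from scipy.optimize import linprog

def local_qr(X, y, tau=0.5):
    """Fit quantile regression on one batch by solving the standard LP:
    min tau*1'u + (1-tau)*1'v  s.t.  X beta + u - v = y,  u, v >= 0."""
    n, p = X.shape
    c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)  # beta free, u/v >= 0
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds)
    return res.x[:p]

def dc_qr(X, y, m, tau=0.5):
    """Naive divide-and-conquer: average the local QR estimators
    computed on disjoint batches of size m."""
    n = X.shape[0]
    betas = [local_qr(X[i:i + m], y[i:i + m], tau) for i in range(0, n, m)]
    return np.mean(betas, axis=0)

# Small simulation: 10 batches of size m = 200, median regression (tau = 0.5).
rng = np.random.default_rng(0)
n, m = 2000, 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 2.0, -1.0])
y = X @ beta_true + rng.normal(size=n)  # noise has median 0
beta_hat = dc_qr(X, y, m, tau=0.5)
```

Note that this sketch only covers the one-shot averaging baseline, whose validity requires n=o(m^2); the paper's proposed estimator instead refines an initial batch estimator through multiple rounds of aggregation.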


Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program