All Times EDT

Abstract Details

Activity Number: 184 - Recent Advances in Statistical Machine Learning
Type: Invited
Date/Time: Tuesday, August 10, 2021 : 1:30 PM to 3:20 PM
Sponsor: IMS
Abstract #316639
Title: High-Dimensional Principal Component Analysis with Heterogeneous Missingness
Author(s): Ziwei Zhu * and Tengyao Wang and Richard J. Samworth
Companies: University of Michigan, Ann Arbor and University College London and University of Cambridge
Keywords: Missing Data; PCA; Heterogeneous Missingness; Missing Completely at Random; High-Dimensional Statistics
Abstract:

We study the problem of high-dimensional Principal Component Analysis (PCA) with missing observations. Our main contribution is a new method, which we call primePCA, that is designed to cope with situations where observations may be missing in a heterogeneous manner. Given a good initialiser, primePCA iteratively projects the observed entries of the data matrix onto the column space of our current estimate to impute the missing entries, and then updates our estimate by computing the leading right singular space of the imputed data matrix. When the true principal components satisfy an incoherence condition and the signal is not too small, the error of primePCA provably converges to zero at a geometric rate. An important feature of our theoretical guarantees is that they depend on average, as opposed to worst-case, properties of the missingness mechanism. Our numerical studies on both simulated and real data reveal that primePCA exhibits very encouraging performance across a wide range of scenarios, including settings where the data are not Missing Completely At Random.
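The impute-then-update iteration described in the abstract can be illustrated with a minimal NumPy sketch. This is a simplified reading of the abstract and not the authors' primePCA implementation. The function names (`impute_and_refine`, `iterative_pca_missing`, `subspace_distance`), the NaN encoding of missing entries, the zero-filled SVD initialiser, and the fixed iteration count are all assumptions made for illustration. The abstract only asks for "a good initialiser" and does not specify one, and any safeguards used by the actual method are omitted here.

```python
import numpy as np


def impute_and_refine(Y, V):
    """One iteration: impute the NaN entries of Y using V, then re-estimate V.

    Y is an (n, d) data matrix with NaN marking missing entries (assumed encoding).
    V is a (d, K) matrix with orthonormal columns, the current estimate.
    """
    K = V.shape[1]
    Y_hat = Y.copy()
    for i in range(Y.shape[0]):
        obs = ~np.isnan(Y[i])
        if obs.all():
            continue  # nothing to impute in this row
        if not obs.any():
            Y_hat[i] = 0.0  # no information in this row; fill with zeros
            continue
        # Regress the observed entries on the matching rows of V to get
        # the row's coordinates in the current K-dimensional subspace.
        u, *_ = np.linalg.lstsq(V[obs], Y[i, obs], rcond=None)
        # Fill the missing entries with their values under that fit.
        Y_hat[i, ~obs] = V[~obs] @ u
    # Update the estimate: leading K right singular vectors of the imputed matrix.
    _, _, Vt = np.linalg.svd(Y_hat, full_matrices=False)
    return Vt[:K].T


def iterative_pca_missing(Y, K, n_iter=50):
    """Run the impute/update loop from a zero-filled SVD start (an assumed initialiser)."""
    _, _, Vt = np.linalg.svd(np.nan_to_num(Y, nan=0.0), full_matrices=False)
    V = Vt[:K].T
    for _ in range(n_iter):
        V = impute_and_refine(Y, V)
    return V


def subspace_distance(V1, V2):
    """Frobenius distance between the projection matrices onto span(V1) and span(V2)."""
    return np.linalg.norm(V1 @ V1.T - V2 @ V2.T)
```

To try the sketch, one could simulate a low-rank matrix, set a random subset of its entries to NaN, call `iterative_pca_missing(Y, K)`, and compare the result with the true loading matrix using `subspace_distance`.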


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program