JSM 2017 Online Program

Activity Number:	304 - Statistical Learning: Dimension Reduction
Type:	Contributed
Date/Time:	Tuesday, August 1, 2017 : 8:30 AM to 10:20 AM
Sponsor:	Section on Statistical Learning and Data Science
Abstract #323018
Title:	Sparse Principal Component Analysis with Missing Observations
Author(s):	Seyoung Park* and Hongyu Zhao
Companies:	and Yale University
Keywords:	PCA ; Missing data ; High dimensional ; Sparsity
Abstract:	Principal component analysis (PCA) is a commonly used statistical method in a wide range of applications. However, it does not work well when the number of features is larger than the sample size. Moreover, it is unclear how to properly handle incomplete data in PCA. We consider the estimation of the sparse principal subspace in the high-dimensional setting with missing data. We propose a two-step estimation procedure and establish the rates of convergence for estimating the principal subspace. Simulated examples show that the procedure performs competitively with existing sparse PCA methods. We also apply the method to single-cell data, which typically have many missing values, and show that the proposed method distinguishes cell types better than other PCA methods.
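
The abstract does not spell out the two-step procedure, so the following is only a generic illustration of the problem setting, not the authors' method. It is a minimal Python sketch, assuming entries are missing completely at random: it estimates a covariance matrix from the observations available for each pair of features, then truncates the leading eigenvector to its largest entries. The function names, the simulated data, and the sparsity level `k` are all hypothetical choices made for this example.

```python
import numpy as np


def pairwise_covariance(X):
    """Covariance estimate from data containing NaNs.

    Each entry (i, j) averages over the rows where both feature i and
    feature j are observed. Illustrative only; the result need not be
    positive semidefinite.
    """
    observed = ~np.isnan(X)
    counts = observed.sum(axis=0)
    means = np.nansum(X, axis=0) / np.maximum(counts, 1)
    # Center observed entries; missing entries contribute zero to the sums.
    centered = np.where(observed, X - means, 0.0)
    mask = observed.astype(float)
    pair_counts = mask.T @ mask
    return (centered.T @ centered) / np.maximum(pair_counts, 1.0)


def sparse_leading_direction(S, k):
    """Leading eigenvector of S, hard-thresholded to at most k nonzero entries."""
    _, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    v = eigvecs[:, -1]
    keep = np.argsort(np.abs(v))[-k:]
    sparse_v = np.zeros_like(v)
    sparse_v[keep] = v[keep]
    return sparse_v / np.linalg.norm(sparse_v)


# Hypothetical simulated data: more features than samples, one sparse
# direction of high variance, and roughly 30% of entries missing.
rng = np.random.default_rng(0)
n, p, k = 60, 100, 5
true_v = np.zeros(p)
true_v[:k] = 1.0 / np.sqrt(k)
X = rng.normal(size=(n, p)) + 3.0 * rng.normal(size=(n, 1)) * true_v
X[rng.random(size=X.shape) < 0.3] = np.nan

S = pairwise_covariance(X)
v_hat = sparse_leading_direction(S, k)
```

Hard thresholding is used here only because it is the simplest way to impose sparsity; it comes with none of the convergence guarantees the abstract refers to.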

Authors who are presenting talks have a * after their name.