Abstract Details

Activity Number: 575 - Statistical Methods for Batch Effect Correction and Cell Type Deconvolution
Type: Contributed
Date/Time: Wednesday, July 31, 2019 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #306912
Title: Learning from Unobserved Covariates for Improved Classification Accuracy
Author(s): Yujia Pan* and Johann A Gagnon-Bartsch
Companies: University of Michigan and University of Michigan
Keywords: classification; high dimensional statistics; latent variables; genomics

Classification of high-throughput genomic data is challenging because the signal is often weak and sparse. Incorporating side information or additional covariates (e.g., gender, age) can improve predictive accuracy, but such information is often unobserved. To address this, we introduce a classifier that adaptively leverages both observed variables and inferred latent ones. Including these latent variables tends to improve accuracy, sometimes substantially, as illustrated on several simulated and genomic datasets. We consider a diverse collection of genomic data types (gene expression, methylation, and SNP data) and a wide range of disease phenotypes (asthma, Alzheimer’s disease, tuberculosis, and schizophrenia), illustrating the broad applicability of our method.

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program