
Abstract Details

Activity Number: 593
Type: Topic Contributed
Date/Time: Wednesday, August 3, 2016 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #321073
Title: Batch Effects in Genomics Data
Author(s): Florian Buettner* and Oliver Stegle
Companies: and EMBL-European Bioinformatics Institute
Keywords: latent variable models ; Bayesian factor analysis ; variational inference ; single-cell RNA-seq ; batch effects ; pathway analysis
Abstract:

Single-cell RNA-sequencing (scRNA-seq) allows profiling genome-wide expression heterogeneity in ensembles of cells. However, cells can vary due to both technical and biological factors, causing correlated gene expression changes. We build on sparse factor analysis models to jointly infer and account for three kinds of factors: known covariates such as batch, biological sources of variation in the form of factors from a pathway database, and additional confounding factors. Pathway factors are encoded using a spike-and-slab prior on the weights, and a second level of factor-wise regularization is used to determine which known and hidden factors are relevant in a given dataset. We present an efficient variational inference scheme under which our model scales linearly in the number of cells and factors. In simulation studies, as well as in several real studies where the true sources of variation are well understood, we show that our model allows decomposing scRNA-seq data into interpretable components, thereby robustly revealing the drivers of expression heterogeneity. We illustrate the potential of our model by exploring associations between DNA methylation heterogeneity and pluripotency variation in single-cell data,
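
To make the model structure described above more concrete, the following is a minimal simulation sketch of a sparse factor model with spike-and-slab weights informed by a pathway annotation, plus one known batch covariate. It is not the authors' implementation: the function name, the parameter names (`pi_on`, `pi_off`, `noise_sd`), the random annotation matrix, and all default values are assumptions chosen for illustration. The factor-wise regularization and the variational inference scheme mentioned in the abstract are not shown.

```python
import numpy as np


def simulate_sparse_factor_data(n_cells=200, n_genes=50, n_factors=3,
                                pi_on=0.9, pi_off=0.05, noise_sd=0.1, seed=0):
    """Draw toy expression data from a sparse factor model (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Hypothetical pathway annotation: True if gene g is annotated to factor k.
    annotation = rng.random((n_genes, n_factors)) < 0.2

    # Spike-and-slab weights: annotated genes are likely to be "on" (slab),
    # unannotated genes are likely to be exactly zero (spike).
    incl_prob = np.where(annotation, pi_on, pi_off)
    z = rng.random((n_genes, n_factors)) < incl_prob
    slab = rng.normal(0.0, 1.0, size=(n_genes, n_factors))
    weights = z * slab

    # Known covariate: a binary batch label with a gene-specific effect.
    batch = rng.integers(0, 2, size=n_cells)
    batch_effect = rng.normal(0.0, 1.0, size=n_genes)

    # Latent factor activities per cell.
    factors = rng.normal(0.0, 1.0, size=(n_cells, n_factors))

    # Expression = factor contribution + batch contribution + Gaussian noise.
    noise = rng.normal(0.0, noise_sd, size=(n_cells, n_genes))
    y = factors @ weights.T + np.outer(batch, batch_effect) + noise
    return y, factors, weights, z, batch
```

Calling `simulate_sparse_factor_data()` returns a cells-by-genes matrix together with the factors, weights, inclusion indicators, and batch labels that generated it, which is the kind of setting where the true sources of variation are known and a decomposition can be checked against them.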


Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program
