Online Program


All Times EDT

Virtual
Contributed Presentations

ConQuR: Batch Effect Correction for Microbiome Data in Large-Scale Epidemiology Studies via Conditional Quantile Regression (309862)

*Wodan Ling, Fred Hutchinson Cancer Research Center 

Keywords: Batch effect removal, conditional quantile estimation, large-scale epidemiology studies, microbiome data, two-part quantile model

Integrating batches of data boosts the power to detect associations between microbiome data and clinical variables. However, the combined microbiome data suffer from batch effects, which lead to excess false positives and false negatives. Most existing strategies for microbiome batch effect removal rely on approaches originally designed for genomic analysis. Many of them use Gaussian linear or negative binomial regression models, and therefore fail to adequately address the sparsity, dispersion, and heterogeneity of microbiome data. The strategies tailored for microbiome data can only be used for association testing or require particular types of controls/spike-ins. We developed ConQuR, a batch correction method using a two-part quantile regression model that accounts for both the inflated zeros and the complex distributional attributes of the non-zero measures. It preserves the zero-inflated integer nature of microbiome data, making the corrected data compatible with any subsequent normalization and analysis. We applied ConQuR to two real data sets and showed its success in removing batch effects and amplifying the signals of key variables in association testing and prediction.
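
To make the two-part idea concrete, the following is a minimal, covariate-free sketch in Python; it is not the ConQuR implementation. It assumes a single taxon, replaces the regression models with simple stand-ins (a presence proportion for the zero part, empirical quantiles of the non-zero counts for the quantile part), and maps each count from its own batch's distribution to the value at the same percentile in a reference batch. All function names (`fit_two_part`, `two_part_cdf`, `two_part_quantile`, `correct_taxon`) and the toy data are hypothetical.

```python
import math
from bisect import bisect_right


def fit_two_part(counts):
    """Stand-in two-part fit: (presence probability, sorted non-zero counts)."""
    nonzero = sorted(c for c in counts if c > 0)
    return len(nonzero) / len(counts), nonzero


def two_part_cdf(model, y):
    """CDF of the zero-inflated distribution: point mass at 0 plus non-zero part."""
    presence, nonzero = model
    if y <= 0:
        return 1.0 - presence
    return (1.0 - presence) + presence * bisect_right(nonzero, y) / len(nonzero)


def two_part_quantile(model, p):
    """Quantile of the zero-inflated distribution; always returns an observed integer."""
    presence, nonzero = model
    if not nonzero or p <= (1.0 - presence) + 1e-12:
        return 0  # percentile falls inside the zero mass
    q = (p - (1.0 - presence)) / presence  # percentile within the non-zero part
    k = min(len(nonzero), max(1, math.ceil(q * len(nonzero) - 1e-9)))
    return nonzero[k - 1]


def correct_taxon(counts, batches, reference):
    """Map each count to the reference batch's value at the same percentile."""
    models = {
        b: fit_two_part([c for c, bb in zip(counts, batches) if bb == b])
        for b in set(batches)
    }
    ref = models[reference]
    return [
        c if b == reference else two_part_quantile(ref, two_part_cdf(models[b], c))
        for c, b in zip(counts, batches)
    ]


# Toy data (made up for illustration): batch B has inflated counts and fewer zeros.
counts = [0, 0, 2, 4, 0, 10, 20, 40]
batches = ["A", "A", "A", "A", "B", "B", "B", "B"]
corrected = correct_taxon(counts, batches, reference="A")
```

Because every corrected value is either zero or a count observed in the reference batch, the output stays zero-inflated and integer-valued, which is the property the abstract emphasizes. A regression-based version would condition both parts on covariates, which this sketch omits.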