Thursday, May 30

Deciphering Biological Systems via Innovative Statistical Learning Methods

Thu, May 30, 10:30 AM - 12:05 PM
Grand Ballroom I

Modeling Bias in Compositional Data (305057)

Amy Willis, University of Washington
*David Clausen, University of Washington

Keywords: microbiome, statistical learning, machine learning, batch effects, sequencing

The composition of a microbiome is an important parameter to estimate, given the critical role that microbiomes play in human and environmental health. However, profiling a microbial community with high-throughput sequencing yields a distorted picture of the community's true composition. Sequencing mock communities -- artificially constructed microbiomes of known composition -- clearly illustrates that observed composition is a biased estimate of true composition, with certain taxa consistently overobserved or underobserved compared to their true relative abundance. We propose a statistical learning model for bias in compositional data and illustrate its performance on data from the Vaginal Microbiome Consortium. We show how our model can be used to correct for batch-specific biases, permitting meta-analysis of microbiome studies.

ASA Meetings Department