Abstract:
|
It is challenging to associate features such as human health outcomes, diet, environmental conditions, or other metadata with microbial community measurements, due in part to the quantitative properties of those measurements. Microbiome multi’omics data are typically noisy, sparse (zero-inflated), high-dimensional, extremely non-normal, and often in the form of count or compositional measurements. Here we introduce an optimal combination of novel and established methodology to assess multivariable association of microbial community features with complex metadata in population-scale observational studies. Our approach, MaAsLin2, uses linear models to accommodate a wide variety of modern epidemiological study designs, including cross-sectional and longitudinal designs. To construct this method, we conducted a large-scale simulation-based evaluation of a broad range of scenarios under which straightforward identification of meta’omic associations can be challenging. These simulation studies reveal that MaAsLin2’s linear model preserves statistical power in the presence of repeated measures and multiple covariates while accounting for the nuances of meta’omic features and controlling false discovery.
|