Abstract:
|
Biomedical meta-analysis, biomarker discovery, and population structure determination have all benefited from the development of statistical methods. Human microbiome data present many of the same research challenges, along with new and emerging statistical considerations. In particular, inflammatory bowel disease (IBD) is an important microbiome-linked condition that is heterogeneous in both clinical phenotypes and gut microbial profiles, and there is no consensus on the microbial ecotypes or patterns of variation that explain this heterogeneity. By extending recent biostatistical work in cancer gene expression, we characterized consistent population structure in patients' gut microbiomes through a meta-analysis of seven IBD studies. An evaluation of data handling practices identified those that were most sensitive to biological variation (including the known effects of Bacteroides and Prevotella microbes) and most robust to batch and technical differences. Multiple unsupervised clustering methods, combined with different clustering strength metrics, agreed that the IBD gut microbiome lacks discrete structure. Supervised random-forest modelling accurately classified within-IBD heterogeneity across studies.
|