Statistical and Computational Methods for Microbiome and Metagenomics Data Analysis — Professional Development Continuing Education Course
ASA, Section on Statistics in Genomics and Genetics
High-throughput sequencing technologies enable large-scale, individualized characterization of microbiome composition, function and community dynamics. The human microbiome, defined as the community of microbes in and on the human body, affects human health and disease risk by dynamically interacting with host diet, genetics, metabolism and environment. The resulting microbiome data, together with genomics and metabolomics data, can potentially be used for personalized diagnostic assessment, risk stratification, disease prevention and treatment. New computational and statistical methods are being developed to understand the function of microbial communities by integrating microbiome data with other omics data. In this short course, we will give detailed presentations on statistical and computational methods for measuring important features of the microbiome from shotgun metagenomic sequencing data, and on how these features are used as the outcome of an intervention, as a mediator of a treatment, and as a covariate to be controlled for when studying disease/exposure associations. The statistics underlying some of the most popular tools in microbiome data analysis will be presented, including bioBakery tools for meta’omic profiling and tools for microbial community profiling (MetaPhlAn, HUMAnN, DADA2, DEMIC, etc.), together with advanced methods for compositional data analysis and kernel-based association analysis.
Instructor(s): Curtis Huttenhower, Harvard T.H. Chan School of Public Health; Hongzhe Li, University of Pennsylvania