Abstract:
|
Culture-independent microbial community sequencing data are challenging to analyze, in part because of their quantitative properties. They are typically noisy, sparse (zero-inflated), high-dimensional, and extremely non-normal, and they usually arise as either count or compositional measurements. I will discuss Bayesian models for two common types of microbiome data: taxonomic profiles (which indicate the abundances of organismal features) and ecological interactions (i.e. significant co-occurrence or co-variation). The first, SparseDOSSA, parameterizes typical microbial sequencing counts across taxa and samples, which in turn allows realistic simulated data to be generated for methods development. The second, BAnOCC, infers covariance between unobserved basis features (i.e. absolute microbial abundances) given compositional data (i.e. the relative abundances that typical sequencing produces). I will conclude with comments on other types of graphical models (e.g. Gaussian processes) for microbiome epidemiology studies.
|