
Abstract Details

Activity Number: 322
Type: Contributed
Date/Time: Tuesday, August 2, 2016 : 8:30 AM to 10:20 AM
Sponsor: Section on Bayesian Statistical Science
Abstract #320118
Title: What's in That Ecology? A Latent Factor Dirichlet-Multinomial Model for Metagenomic Count Data
Author(s): John O'Brien*
Companies: Bowdoin College
Keywords: metagenomics ; ecology ; Dirichlet-multinomial ; latent factor ; reversible jump ; mixture model

Metagenomic sequencing allows researchers to gather abundance data rapidly and inexpensively on a nearly complete set of microbial taxa within environmental samples. How to model the structure of the ecological dynamics these data reveal is an important applied statistical problem with consequences for environmental science and medicine. The Dirichlet-multinomial mixture model (DMM) has become the gold standard for analyzing these datasets, providing estimates of the underlying clusters that give rise to the data. However, this approach makes strong assumptions about the functional equivalence of taxa within the ecology, and these assumptions are often violated in practice. I show how this issue can be resolved by introducing latent factors that combine to give a Dirichlet-multinomial likelihood. Taking a Bayesian approach, I provide a reversible jump implementation that efficiently infers the latent factors. Applying the model to two metagenomic datasets and a classical plankton dataset, I show that the latent factor model gives improved interpretability over the DMM, and I conclude with possible computational refinements.
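
For readers unfamiliar with the Dirichlet-multinomial likelihood that both the DMM and the latent factor model build on, the following is a minimal sketch of its standard log probability mass function for one sample of taxon counts. It is not the author's implementation and does not attempt the latent factor construction or the reversible jump sampler, whose details the abstract does not give. The function name `dirichlet_multinomial_logpmf` and the arguments `counts` and `alpha` are illustrative assumptions.

```python
from math import lgamma, exp


def dirichlet_multinomial_logpmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial.

    counts: non-negative integer counts, one per taxon (hypothetical input).
    alpha:  positive concentration parameters, one per taxon.
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    if any(a <= 0 for a in alpha):
        raise ValueError("all alpha values must be positive")
    if any(x < 0 for x in counts):
        raise ValueError("counts must be non-negative")

    n = sum(counts)        # total reads in the sample
    a_sum = sum(alpha)     # total concentration

    # Normalizing terms: log[ n! * Gamma(a_sum) / Gamma(n + a_sum) ]
    logp = lgamma(n + 1) + lgamma(a_sum) - lgamma(n + a_sum)

    # Per-taxon terms: log[ Gamma(x + a) / (x! * Gamma(a)) ]
    for x, a in zip(counts, alpha):
        logp += lgamma(x + a) - lgamma(x + 1) - lgamma(a)
    return logp


# Usage: with two taxa and alpha = (1, 1), every split of n reads is
# equally likely, so the probability is 1 / (n + 1).
p = exp(dirichlet_multinomial_logpmf([3, 2], [1.0, 1.0]))
```

In a mixture setting such as the DMM, each cluster would carry its own `alpha` vector and a sample's likelihood would be a weighted sum of these terms across clusters; that step is omitted here.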

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

Copyright © American Statistical Association