JSM 2014 Home

Abstract Details

Activity Number: 84
Type: Contributed
Date/Time: Sunday, August 3, 2014 : 4:00 PM to 5:50 PM
Sponsor: Section on Statistics in Epidemiology
Abstract #313698
Title: Bayesian Topic Models for Soft Clustering of Metagenomic Count Data
Author(s): Daniel Conn*+
Companies: University of California, Los Angeles
Keywords: Metagenomics ; Latent Dirichlet Allocation ; Hierarchical Dirichlet Process ; Bayesian Topic Model
Abstract:

Next-generation sequencing technologies produce high-dimensional matrices of count data that represent the relative abundance of thousands of bacterial species. A common starting point in analyzing high-dimensional data is to search for lower-dimensional structure. Soft clustering models are a type of dimension reduction technique that has garnered interest in recent years. Rather than assuming that each sample falls into exactly one cluster, these models allow a sample to belong partially to several clusters. We believe that Bayesian topic models, such as latent Dirichlet allocation (LDA) and the hierarchical Dirichlet process (HDP), can be used as soft clustering models in metagenomics. The HDP model has the further advantage of letting the data determine the number of clusters. In topic analysis, the data are summarized by a table of word counts; in metagenomics, by a table of taxonomy counts. Despite this superficial similarity, we will discuss how the usual priors for the LDA and HDP models should be restructured to achieve scientifically relevant results.
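
To make the analogy between word counts and taxonomy counts concrete, the following minimal sketch simulates a sample-by-taxon count table from the standard LDA generative process, with clusters playing the role of topics and taxa playing the role of words. It is an illustration only: the function names, the symmetric Dirichlet hyperparameters `alpha` and `eta`, and the fixed sequencing depth are assumptions made for this example, and the priors shown are the usual defaults rather than the restructured priors discussed in the talk, which the abstract does not specify.

```python
import random


def dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_lda_counts(n_samples, n_taxa, n_clusters, depth,
                        alpha=0.5, eta=0.5, seed=0):
    """Simulate taxon counts under the standard LDA generative model.

    alpha and eta are symmetric Dirichlet hyperparameters (assumed values).
    depth is the number of reads per sample (assumed constant here).
    """
    rng = random.Random(seed)
    # Each cluster ("topic") is a probability distribution over taxa ("words").
    cluster_taxa = [dirichlet([eta] * n_taxa, rng) for _ in range(n_clusters)]
    memberships, counts = [], []
    for _ in range(n_samples):
        # Soft membership: the sample's mixing proportions over clusters.
        theta = dirichlet([alpha] * n_clusters, rng)
        row = [0] * n_taxa
        for _ in range(depth):
            # Each read first picks a cluster, then a taxon from that cluster.
            k = rng.choices(range(n_clusters), weights=theta)[0]
            t = rng.choices(range(n_taxa), weights=cluster_taxa[k])[0]
            row[t] += 1
        memberships.append(theta)
        counts.append(row)
    return memberships, counts


memberships, counts = simulate_lda_counts(
    n_samples=4, n_taxa=6, n_clusters=2, depth=200
)
for theta, row in zip(memberships, counts):
    print([round(p, 2) for p in theta], row)
```

Each printed line pairs a sample's cluster proportions with its taxon counts. A sample whose proportions are split between clusters is "partially in one cluster and partially in another", which is the soft clustering behavior described above. Fitting the model reverses this process by inferring the proportions from the observed counts.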


Authors who are presenting talks have a * after their name.


The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Copyright © American Statistical Association.