Abstract Details

Activity Number: 652 - Genomics, Metabolomics, Microbiome and NextGen Sequencing
Type: Contributed
Date/Time: Thursday, August 1, 2019 : 10:30 AM to 12:20 PM
Sponsor: Biometrics Section
Abstract #304543
Title: Model-Based Clustering of Illumina Microbiome Amplicon Sequence Data
Author(s): Xiyu Peng* and Karin Dorman
Companies: Iowa State University and Iowa State University
Keywords: Amplicon Sequencing; Microbiome; Mixture Model; Model-based Clustering

Next-generation amplicon sequencing is a powerful tool for investigating microbial communities. A main challenge is distinguishing true biological variants from errors introduced by PCR and sequencing. In the traditional analysis pipeline, such errors are eliminated by clustering reads within a sequence similarity threshold, usually 97%, and constructing operational taxonomic units (OTUs). However, the arbitrary threshold can lead to low resolution and high false positive rates. Here, we introduce AmpliCI, a reference-free, model-based method for rapidly resolving the number, abundance and identity of error-free sequences in massive Illumina amplicon datasets. AmpliCI takes quality information into account and allows the data, rather than an arbitrary threshold or an external database, to drive the conclusions. AmpliCI estimates a mixture model, using a greedy strategy to gradually select error-free sequences while approximately maximizing the likelihood. In a simulation study, our method achieves better accuracy than competing methods, especially on communities of closely related species. AmpliCI also shows comparable or better accuracy when analyzing real mock community datasets.
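
To make the idea of greedily fitting a quality-aware mixture model concrete, here is a minimal Python sketch. It is not AmpliCI's implementation. It is a toy illustration under stated assumptions: all reads have the same length, the per-base error probability is taken directly from the Phred score, errors are uniform over the three other bases, candidates are tried in order of observed abundance, and a candidate is kept only if it raises the log-likelihood by more than a hypothetical threshold (min_gain). The names error_prob, read_log_lik, fit_weights, greedy_select and min_gain are all invented for this example.

```python
import math
from collections import Counter

def error_prob(q):
    """Phred score -> error probability, capped so log(1 - p) stays finite."""
    return min(10.0 ** (-q / 10.0), 0.75)

def read_log_lik(read, quals, center):
    """log P(read | true sequence = center), errors uniform over 3 other bases."""
    total = 0.0
    for base, q, true_base in zip(read, quals, center):
        p = error_prob(q)
        total += math.log(1.0 - p) if base == true_base else math.log(p / 3.0)
    return total

def fit_weights(log_lik, n_iter=50):
    """EM for the mixing proportions with the centers held fixed.
    log_lik[i][k] is the log-likelihood of read i under center k."""
    n, k = len(log_lik), len(log_lik[0])
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        counts = [0.0] * k
        for row in log_lik:
            m = max(row)  # subtract the row maximum for numerical stability
            terms = [w * math.exp(l - m) for w, l in zip(weights, row)]
            s = sum(terms)
            for j, t in enumerate(terms):
                counts[j] += t / s  # responsibility of center j for this read
        weights = [max(c / n, 1e-12) for c in counts]  # small floor avoids zeros
        z = sum(weights)
        weights = [w / z for w in weights]
    total = 0.0
    for row in log_lik:
        m = max(row)
        total += m + math.log(sum(w * math.exp(l - m) for w, l in zip(weights, row)))
    return weights, total

def greedy_select(reads, quals, min_gain=5.0):
    """Try unique sequences in order of abundance; keep a candidate only if the
    fitted mixture log-likelihood improves by more than min_gain (hypothetical)."""
    candidates = [seq for seq, _ in Counter(reads).most_common()]
    selected, best_weights, best_ll = [], [], -math.inf
    for cand in candidates:
        trial = selected + [cand]
        log_lik = [[read_log_lik(r, q, c) for c in trial]
                   for r, q in zip(reads, quals)]
        weights, ll = fit_weights(log_lik)
        if not selected or ll - best_ll > min_gain:
            selected, best_weights, best_ll = trial, weights, ll
    return selected, best_weights

# Toy data: two true variants one base apart, plus one read whose
# mismatching base has a very low quality score.
a, b, c = "ACGTACGT", "ACGTACGA", "TCGTACGT"
reads = [a] * 20 + [b] * 10 + [c]
quals = [[40] * 8] * 30 + [[2] + [40] * 7]
centers, weights = greedy_select(reads, quals)
print(centers, [round(w, 3) for w in weights])
```

In this toy data the two high-quality variants differ by a single base, so a 97% similarity threshold could not separate 8-base sequences like these, while the likelihood comparison can. The low-quality singleton is cheap to explain as an error of the abundant variant, so adding it as a third component yields essentially no likelihood gain.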

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program