Activity Number: 580
Type: Invited
Date/Time: Wednesday, August 3, 2016 : 2:00 PM to 3:50 PM
Sponsor: Biometrics Section
Abstract #318222
Title: COCACOLA: Binning Metagenomic Contigs Using Sequence COmposition, Read CoverAge, CO-Alignment, and Paired-End Read LinkAge
Author(s): Fengzhu Sun* and Yang Lu and Ting Chen and Jed Fuhrman
Companies: University of Southern California and University of Southern California and University of Southern California and University of Southern California
Keywords: next generation sequencing; contig binning; k-tuple; non-negative matrix factorization; regularization
Abstract:
The advent of next-generation sequencing (NGS) technologies enables researchers to sequence complex microbial communities directly from the environment. Since assembly typically produces only genome fragments, known as contigs, rather than entire genomes, it is crucial to group them into operational taxonomic units (OTUs) for further taxonomic profiling and downstream functional analysis. OTU clustering is also referred to as binning. We present COCACOLA, a general framework that automatically bins contigs into OTUs based on sequence composition and coverage across multiple samples, using nonnegative matrix factorization with regularization. It can also incorporate additional information for contig binning, such as co-alignment to reference genomes and linkage of contigs provided by paired-end reads. The effectiveness of COCACOLA is demonstrated on both simulated and real datasets in comparison to state-of-the-art binning approaches such as CONCOCT, GroopM, MaxBin, and MetaBAT. The software is available at https://github.com/younglululu/COCACOLA
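As an illustration of the general idea only, and not the COCACOLA implementation, the following minimal sketch bins a few toy contigs by stacking k-tuple composition and per-sample coverage into a nonnegative feature matrix and factorizing it with ridge-regularized multiplicative updates. The helper names (`kmer_profile`, `nmf_bin`), the penalty form, the parameter values, and the toy sequences and coverage numbers are all assumptions made for this example.

```python
import numpy as np
from itertools import product


def kmer_profile(seq, k=2):
    """Normalized k-tuple frequency vector for one contig (hypothetical helper)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # k-tuples with ambiguous bases are skipped
        if j is not None:
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def nmf_bin(X, n_bins, n_iter=200, l2=0.1, seed=0):
    """Approximate X (features x contigs) by W @ H with W, H >= 0.

    Uses multiplicative updates for squared error plus an assumed
    ridge penalty l2 * ||H||^2, then assigns each contig to the bin
    with the largest coefficient in its column of H.
    """
    rng = np.random.default_rng(seed)
    n_features, n_contigs = X.shape
    W = rng.random((n_features, n_bins)) + 1e-3
    H = rng.random((n_bins, n_contigs)) + 1e-3
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + l2 * H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return H.argmax(axis=0), W, H


# Toy input: four made-up contigs and their coverage in two samples.
contigs = ["ACGTACGTAC", "ACGTTCGTAC", "GGGGCCCCGG", "GGGCCCGGGC"]
coverage = np.array([[10.0, 1.0], [12.0, 2.0], [1.0, 9.0], [2.0, 11.0]])

composition = np.array([kmer_profile(c) for c in contigs])
relative_cov = coverage / coverage.sum(axis=1, keepdims=True)
X = np.hstack([composition, relative_cov]).T  # features x contigs

labels, W, H = nmf_bin(X, n_bins=2)
print(labels)
```

In this sketch the number of bins is fixed in advance, and the additional information mentioned in the abstract (co-alignment and paired-end linkage) is not modeled.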
Authors who are presenting talks have a * after their name.
Back to the full JSM 2016 program