Abstract Details

Activity Number: 469
Type: Invited
Date/Time: Wednesday, August 3, 2016 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #318150
Title: State Space Models for the NGS Pipeline
Author(s): Karin S. Dorman* and Xin Yin and Vahid Noroozi and Aditya Ramamoorthy
Companies: Iowa State University and Iowa State University and Iowa State University and Iowa State University
Keywords: bioinformatics ; next generation sequencing ; Hidden Markov Model ; de Bruijn graph
Abstract:

The age-old wisdom "garbage in, garbage out" applies to any analysis of next-generation sequencing (NGS) data. Pipeline components are typically chained serially, with only minimal transmission of uncertainty and information between steps. For example, base callers rarely use information about the underlying genome sequence, and error correction methods seldom use the error properties of the sequencing technology. We demonstrate that integrated, probabilistic approaches that combine pipeline steps perform better than sequential analysis. Others have improved pipeline operations by borrowing information from alignment to one or more known reference genomes. Our combined approach specifically capitalizes on genome information, but it does not use a known reference genome, so as to avoid biasing against the unknown. We use a Hidden Markov Model on a sparse de Bruijn graph, where the transitions model genetic content and the emissions model the observable data. The combined probabilistic approach removes more errors and transmits information through the pipeline more accurately.
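
As a rough illustration of the general idea (not the authors' implementation), the sketch below treats k-mers as hidden states, uses normalized de Bruijn edge counts as transition probabilities, and scores each observed base with a simple substitution-error emission model. Everything here is an assumption made for illustration: the function names, the fixed error rate `err`, the uniform start distribution, and the use of plain Viterbi decoding to correct a single read.

```python
import math
from collections import defaultdict

def build_transitions(reads, k):
    """Count k-mer -> next k-mer edges (a de Bruijn graph) and normalize
    each node's outgoing counts into transition probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for i in range(len(read) - k):
            counts[read[i:i + k]][read[i + 1:i + k + 1]] += 1
    trans = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        trans[src] = {dst: n / total for dst, n in dsts.items()}
    return trans

def emission_logp(true_base, obs_base, err):
    """Hypothetical emission model: a fixed substitution rate `err`,
    spread evenly over the three wrong bases (no quality scores)."""
    return math.log(1 - err) if true_base == obs_base else math.log(err / 3)

def viterbi_correct(read, trans, k, err=0.05):
    """Return the most probable path through the graph given the read."""
    states = set(trans) | {d for dsts in trans.values() for d in dsts}
    # Uniform start: score every k-mer against the first k observed bases.
    score = {s: sum(emission_logp(t, o, err) for t, o in zip(s, read[:k]))
             for s in states}
    back = []
    for obs in read[k:]:
        new_score, ptr = {}, {}
        for src, src_lp in score.items():
            for dst, p in trans.get(src, {}).items():
                lp = src_lp + math.log(p) + emission_logp(dst[-1], obs, err)
                if dst not in new_score or lp > new_score[dst]:
                    new_score[dst], ptr[dst] = lp, src
        if not new_score:
            raise ValueError("no path in the graph is long enough for this read")
        score = new_score
        back.append(ptr)
    # Trace back from the best final state, collecting each state's last base.
    state = max(score, key=score.get)
    tail = []
    for ptr in reversed(back):
        tail.append(state[-1])
        state = ptr[state]
    return state + "".join(reversed(tail))

# Toy data: 200 clean reads and one read with a single substitution (T -> A).
reads = ["ACGTTGCA"] * 200 + ["ACGTAGCA"]
trans = build_transitions(reads, k=3)
corrected = viterbi_correct("ACGTAGCA", trans, k=3)
```

In this toy setting the erroneous branch has low transition probability, so the decoded path follows the well-supported k-mers and the substitution is corrected without any reference genome. A real pipeline would also need quality-aware emissions, indel handling, and a sparse graph representation, none of which this sketch attempts.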


Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

 
 
Copyright © American Statistical Association