JSM 2014 Home

Abstract Details

Activity Number: 2
Type: Invited
Date/Time: Sunday, August 3, 2014 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Graphics
Abstract #310658
Title: Designing Software for Statistical Analysis of Huge Collections of Sequencing Data
Author(s): Ben Langmead*+
Companies: Johns Hopkins University
Keywords: genomics ; scalability ; MapReduce ; alignment ; RNA-seq
Abstract:

For 4-5 years following the advent of second-generation sequencing, per-instrument throughput increased about 4-fold per year. The challenge for computational genomics is to make it easy for genomics researchers, including statisticians, to use the very large datasets generated as a result. Many studies seek to distinguish a condition of interest (e.g., disease) from a background of biological variability, sequencing error, and bias. Careful study design is key, and it also helps to consider many datasets at once. However, designing software that can easily analyze many large sequencing datasets is a significant technical challenge. I discuss recent ideas about how to design such software. I discuss how the MapReduce framework developed at Google can (or cannot) be used to scale genomics applications across many computers at once, in a way that confers efficiency and fault tolerance. I also describe cloud-enabled, scalable software pipelines developed by me and others, and show how they can yield new insights into sequencing error, bias, and variability.
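
For readers unfamiliar with the MapReduce pattern mentioned in the abstract, the following is a minimal single-machine sketch in Python, not the software described in the talk. It assumes a toy input of aligned reads given as (chromosome, position) pairs and counts reads per fixed-width genomic bin. The names `BIN_SIZE`, `map_read`, `shuffle`, `reduce_counts`, and `run_mapreduce` are hypothetical and chosen only for illustration. A real framework would run the map and reduce steps on many computers and handle failures.

```python
from collections import defaultdict
from itertools import chain

BIN_SIZE = 1000  # hypothetical bin width in bases (an assumption for this sketch)


def map_read(read):
    """Map step: emit one ((chromosome, bin_index), 1) pair per aligned read."""
    chrom, pos = read
    return [((chrom, pos // BIN_SIZE), 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def run_mapreduce(reads):
    """Run map, shuffle, and reduce in sequence on one machine."""
    mapped = chain.from_iterable(map_read(r) for r in reads)
    grouped = shuffle(mapped)
    return dict(reduce_counts(k, v) for k, v in grouped.items())


# Toy input: (chromosome, 0-based alignment position) pairs, invented for illustration.
reads = [("chr1", 150), ("chr1", 980), ("chr1", 1200), ("chr2", 42)]
coverage = run_mapreduce(reads)
```

Each map call depends on a single read and each reduce call on a single key, so the work can in principle be split across machines; that independence is what the pattern relies on for scaling and fault tolerance.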


Authors who are presenting talks have a * after their name.


The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Copyright © American Statistical Association.