Abstract Details

Activity Number: 336 - Next-Generation Sequencing
Type: Contributed
Date/Time: Tuesday, August 1, 2017 : 10:30 AM to 12:20 PM
Sponsor: Biometrics Section
Abstract #323771
Title: Gene Expression Variability and the Analysis of Large-Scale RNA-Seq Studies with the MDSeq
Author(s): Z. John Daye* and Di Ran
Companies: None and University of Arizona
Keywords: Differential analysis; Coefficient of dispersion; Generalized linear model; Negative binomial; Over-dispersed data; Next-generation sequencing
Abstract:

The rapidly decreasing cost of next-generation sequencing has led to the recent availability of large-scale RNA-seq data, which enables the analysis of gene expression variability in addition to gene expression means. In this paper, we present the MDSeq, a method based on the coefficient of dispersion that provides robust and computationally efficient analysis of both the mean and the variability of gene expression from RNA-seq counts. The MDSeq uses a novel reparametrization of the negative binomial distribution to provide flexible generalized linear models (GLMs) on both the mean and the dispersion. We address the challenges of analyzing large-scale RNA-seq data through several new developments that together form a comprehensive toolset: it models technical excess zeros, identifies outliers efficiently, and evaluates differential expression at biologically interesting levels. We evaluated the performance of the MDSeq on simulated data for which the ground truth is known. The results suggest that the MDSeq often outperforms current methods for the analysis of gene expression mean and variability. Moreover, we applied the MDSeq to two real RNA-seq studies, in which we identified functionally relevant genes and gene pathways.
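
The abstract does not spell out the reparametrization. As an illustrative sketch only, assume the coefficient of dispersion is the variance-to-mean ratio, phi = Var(Y) / E(Y), so that a negative binomial count Y has mean mu and variance phi * mu with phi > 1. Under that assumption the usual size and probability parameters are r = mu / (phi - 1) and p = 1 / phi. The function names below are hypothetical and are not taken from the MDSeq software; the snippet only checks numerically that this mapping yields the intended mean and variance.

```python
import math


def nb_logpmf(y, mu, phi):
    """Log-pmf of a negative binomial with mean mu and variance phi * mu.

    Assumed mapping (not confirmed by the abstract):
    size r = mu / (phi - 1), success probability p = 1 / phi.
    """
    if mu <= 0 or phi <= 1:
        raise ValueError("requires mu > 0 and phi > 1")
    r = mu / (phi - 1.0)
    p = 1.0 / phi
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(p) + y * math.log1p(-p)
    )


def nb_moments(mu, phi, upper=2000):
    """Approximate the mean and variance by summing the pmf over 0..upper-1."""
    probs = [math.exp(nb_logpmf(y, mu, phi)) for y in range(upper)]
    mean = sum(y * q for y, q in enumerate(probs))
    var = sum((y - mean) ** 2 * q for y, q in enumerate(probs))
    return mean, var


# Hypothetical values: mean count 20, variance three times the mean.
mean, var = nb_moments(mu=20.0, phi=3.0)
print(mean, var, var / mean)
```

In a GLM on both parameters, mu and phi would each be linked to covariates (for example through log links), so that a treatment effect on phi can be tested separately from a treatment effect on mu. That step is not shown here.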


Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

 
 