Online Program Home
My Program

Abstract Details

Activity Number: 415 - Statistical Methods for Gene Expression and RNA-Seq Analysis
Type: Contributed
Date/Time: Tuesday, July 30, 2019 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #305010
Title: Can You Trust Differential Expression Methods for RNA-Seq Data Analysis ?
Author(s): Boris P Hejblum* and Marine Gauthier and Rodolphe Thiébaut and Denis Agniel
Companies: University of Bordeaux and Université de Bordeaux, Inria/Inserm, VRI and Université de Bordeaux, Inria/Inserm, VRI and RAND Corporation
Keywords: RNA-seq; False positives; Type-I error; Score test; Variance component; EBOLA

Gene expression measurement has shifted from microarrays to next generation RNA-sequencing, producing ever richer high-throughput data for transcriptomics studies. limma/voom, edgeR, and DESeq2, state-of-the-art methods for analyzing such data, can all fail to control the type-I error. As such studies grow in size and frequency, there is an urgent need for better statistical methods. We model log-normalized RNA-seq counts as continuous variables using nonparametric regression to account for their inherent heteroscedasticity, in a principled, model-free, and efficient manner. We rely on a powerful variance component score test that can deal with both covariates adjustment in complex experimental designs (e.g. longitudinal studies) and data heteroscedasticity, to identify the genes whose expression is significantly associated with one or several factors of interest. Our test statistic has a simple form and limiting distribution, which can be computed quickly. A permutation version of the test is also derived for small sample sizes. Our test exhibits very good statistical properties in simulations, and in two applications (a candidate vaccine against EBOLA, and a tuberculosis study).

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program