Abstract:
|
Alternative splicing (AS) is a key mechanism underlying the cellular and functional complexity of the eukaryotic transcriptome. The development of RNA-Sequencing (RNA-Seq) technology has revolutionized the analysis of differential AS, yet the size and complexity of large-scale RNA-Seq datasets continue to pose significant data analysis challenges. One such complex data structure is paired RNA-Seq data, in which measurements are taken on matched samples across groups. We develop a new method, named PAIRADISE (PAIRed Analysis of Differential ISoform Expression), which directly models the average difference in logit exon inclusion levels between two groups via a simple and interpretable parameter. Moreover, PAIRADISE can be applied to several types of mRNA isoform variation, including alternative splicing, alternative polyadenylation, and RNA editing. In simulations, PAIRADISE consistently outperforms existing approaches. Finally, we apply PAIRADISE to several cancer datasets from The Cancer Genome Atlas, where we find significant splicing differences between tumor and healthy samples that were not detected by competing methods.
|
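To make the quantity named in the abstract concrete, the following minimal sketch computes a naive average of paired differences in logit exon inclusion levels from read counts. It is an illustration only and not the PAIRADISE model, which is a statistical model rather than a plug-in average. The function names, the pseudocount of 0.5, and the omission of isoform effective-length normalization are all assumptions made for this example.

```python
import math


def logit(p):
    """Log-odds of a proportion p with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inclusion_level(inc_reads, skip_reads, pseudocount=0.5):
    """Naive exon inclusion level from inclusion and skipping read counts.

    The pseudocount (an assumption of this sketch) keeps the estimate
    strictly between 0 and 1 so that the logit is always defined.
    Effective-length normalization is deliberately ignored here.
    """
    return (inc_reads + pseudocount) / (inc_reads + skip_reads + 2 * pseudocount)


def mean_paired_logit_difference(pairs):
    """Average over matched pairs of logit(psi_group1) - logit(psi_group2).

    Each element of `pairs` is ((inc1, skip1), (inc2, skip2)): the read
    counts for one matched pair of samples, one sample from each group.
    """
    diffs = []
    for (inc1, skip1), (inc2, skip2) in pairs:
        psi1 = inclusion_level(inc1, skip1)
        psi2 = inclusion_level(inc2, skip2)
        diffs.append(logit(psi1) - logit(psi2))
    return sum(diffs) / len(diffs)


# Hypothetical counts for three matched pairs (group 1 vs. group 2).
example_pairs = [((30, 10), (10, 30)), ((25, 15), (12, 28)), ((40, 20), (22, 38))]
delta = mean_paired_logit_difference(example_pairs)
```

A positive `delta` indicates higher exon inclusion in the first group on the logit scale. Because each difference is taken within a matched pair, baseline variation shared by the two samples in a pair cancels, which is the motivation for analyzing paired data as pairs.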