Abstract:
|
Existing approaches for analyzing changes in relative isoform usage between conditions, known as differential splicing (DS), often suffer from speed and scalability issues as the number of samples increases. We propose overcoming these issues by using a method designed to analyze compositional data. Using human RNA-seq data quantified by Salmon, our compositional regression approach yields a greater than 60-fold speed improvement over DRIMSeq, with similar or improved performance across conditions. Additionally, because isoform-level expression estimates from programs such as Salmon and RSEM are computationally derived, the uncertainty in these estimates is often ignored in downstream analyses. To account for this uncertainty, we frame testing for DS as a measurement error problem and apply our compositional regression approach to each set of Gibbs samples. We then combine these results using a multiple imputation approach and show that doing so improves performance under various conditions. We discuss future directions that directly incorporate summaries derived from the Gibbs samples into the modeling approach for improved efficiency.
|
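The step of combining per-Gibbs-sample results can be illustrated with the standard multiple imputation pooling rule (Rubin's rules). The sketch below is only an illustration under stated assumptions: it pools a single scalar estimate, whereas the abstract does not specify which quantities are pooled or how. The function name `pool_rubin` and the numeric inputs are hypothetical.

```python
import numpy as np


def pool_rubin(estimates, variances):
    """Pool m per-replicate estimates and variances with Rubin's rules.

    Assumption: each Gibbs sample yields one scalar estimate and its
    variance. This is an illustrative simplification, not the pooling
    procedure described in the abstract.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    if m < 2 or u.size != m:
        raise ValueError("need at least two estimates and one variance per estimate")

    q_bar = q.mean()                      # pooled point estimate
    within = u.mean()                     # average within-replicate variance
    between = q.var(ddof=1)               # variance of estimates across replicates
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total


# Hypothetical values standing in for results from three Gibbs samples.
est, total_var = pool_rubin([0.10, 0.14, 0.12], [0.004, 0.005, 0.003])
```

The between-replicate term is what carries quantification uncertainty into the final variance: if every Gibbs sample gave the same estimate, the pooled variance would reduce to the average within-replicate variance.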