Abstract:
|
In this study, we propose RoMA, a novel method for RNA-seq differential expression (DE) analysis that is tolerant to data adjustment and can be integrated with numerous upstream and downstream analyses of mRNA abundance in RNA-seq studies. Various DE methods have been proposed, each with its own limitations. RoMA incorporates information from both mRNA abundance and raw counts: it models RPKM (reads per kilobase per million), which represents the relative abundance of mRNA transcripts, and borrows the mean-variance dependency from CPM (counts per million) as a precision weight that accounts for variability in sequencing depth. Studies on simulated data and two real datasets showed that RoMA provides accurate quantification of mRNA abundance and an adjustment-tolerant DE analysis with high AUC (area under the curve), low FDR (false discovery rate), and a desirable type I error rate. This study provides a valid strategy for mRNA abundance modeling and data analysis integration in RNA-seq studies, which will greatly facilitate the identification and interpretation of DE genes. The method is implemented in a user-friendly R package (RoMA).
|
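To make the two quantities named in the abstract concrete, the following minimal sketch computes CPM and RPKM for one sample using their standard definitions. It is illustrative only and is not the RoMA implementation: the function names `cpm` and `rpkm`, the toy counts, and the gene lengths are assumptions introduced here, and the precision-weighting step is not shown because the abstract does not specify it.

```python
def cpm(counts):
    """Counts per million: scale each gene's count by the library size."""
    total = sum(counts)
    if total == 0:
        raise ValueError("library size is zero")
    return [c * 1e6 / total for c in counts]


def rpkm(counts, lengths_bp):
    """Reads per kilobase per million: CPM divided by gene length in kilobases."""
    total = sum(counts)
    if total == 0:
        raise ValueError("library size is zero")
    # count * 1e9 / (library size * length in bp) equals CPM / (length in kb)
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]


# Hypothetical toy data: three genes in a single sample.
counts = [100, 300, 600]
lengths_bp = [1000, 2000, 3000]

print(cpm(counts))
print(rpkm(counts, lengths_bp))
```

CPM depends only on sequencing depth, whereas RPKM additionally normalizes by gene length, which is why RPKM is the quantity read as relative mRNA abundance while CPM stays closer to the raw counts.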