Abstract:
|
Sample normalization is an essential step with considerable impact on the analysis of differentially expressed genes (DEGs) in high-throughput RNA sequencing (RNA-seq) experiments. Although there are numerous methods for normalizing read counts to allow for comparative analysis, it remains a challenge to keep the actual false discovery rate (FDR) below a nominal level. To address this issue specifically, we developed the UQ-pgQ2 normalization method, which applies median per-gene normalization (pgQ2) after upper-quartile (UQ) per-sample global scaling. In this work, we compared the UQ-pgQ2 method with the three most commonly used methods (UQ, DESeq2 and edgeR) using two benchmark Microarray Quality Control (MAQC) RNA-seq datasets. An additional within-group comparison based on publicly available TCGA datasets was used to further assess the normalization methods. The results show that the UQ-pgQ2 method combined with the Wald test from DESeq2 yields the fewest false positives at an FDR cutoff of 0.05. We conclude that our method outperforms UQ, DESeq2 and edgeR in the specificity of DEG detection.
|