Abstract:
|
Meta-analysis that combines multiple transcriptomic studies increases statistical power and accuracy in detecting differentially expressed genes. As next-generation sequencing experiments become mature and affordable, an increasing number of RNA-seq datasets are available in the public domain. A naive approach to combining multiple RNA-seq studies is to apply differential analysis tools to each study and then combine the summary statistics using conventional meta-analysis methods. Such a two-stage approach loses statistical power, especially for genes with short length or low expression abundance. We propose a fully Bayesian hierarchical model, BayesMetaSeq, for RNA-seq meta-analysis that models count data, integrates information across studies, and captures potentially heterogeneous differential signals across studies via latent variables. A Dirichlet process mixture (DPM) prior is further placed on the latent variables to categorize the detected biomarkers according to their differential expression patterns across studies. Simulations and a real application to HIV transgenic rats demonstrate the method's improved sensitivity, accuracy, and biological findings.
|