Abstract:
|
Existing methods for gene set enrichment analysis tend to follow a two-stage procedure: multiple hypothesis testing is performed first at the individual gene level and then at the gene set level. The weakness of this approach is that it disregards the uncertainty in the gene-level association results. We propose a Bayesian hierarchical model for gene set enrichment analysis. By modeling the association status of each gene as a latent variable, our method carries the uncertainty of the gene-level association analysis over into enrichment estimation. Within an empirical Bayes inference framework, the enrichment estimates can then be used as prior information in assessing a local false discovery rate (local fdr) for each gene. Our work also presents a general solution for testing multiple hypotheses with non-exchangeable data, which achieves optimal power in an asymptotic setting. In addition to simulation studies, we demonstrate our method with two real data applications: a differential gene expression analysis using the RNA-seq data from Moyerbrailean et al. (2016), and a transcriptome-wide association analysis of lipid traits using eQTL data from the GTEx project.
|