Determining differential gene expression between two or more sample groups is of great biomedical interest, both for understanding the genetic causes of diseases and health conditions and for evaluating the efficacy of genetic treatments. Gene set testing is a relatively new approach that tests for differential expression between sample groups at the level of groups of functionally related genes, called gene sets. In this research, we compared the statistical power and false discovery rate (FDR) of the following gene set test methods: mvGST, ROAST, CAMERA, ROMER, GlobalTest, GSA, PAGE, SAFE, sigPathway, and GSEAlm.
We developed a simulation framework to generate data sets that are both biologically relevant and representative of actual gene expression data. We identified several biological parameters of interest and determined realistic values for each of them, either by sampling real gene expression data sets or by reviewing the literature. We then identified five parameter pairings of interest and tested each combination of parameter values with either 50 or 100 simulated data sets, both to determine how power and false discovery rate (FDR) vary as a function of each parameter and to identify possible interactions between parameters.
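As a rough illustration of how power and FDR might be estimated from simulated data sets, the sketch below averages per-data-set results for a single parameter combination. It is not the framework used in this research: the function names, the gene set labels, and the parameter names and values in the grid are hypothetical, and each data set is assumed to contain at least one truly differentially expressed gene set.

```python
from itertools import product
from statistics import mean


def power_and_fdp(truly_de, called_de):
    """Return (power, false discovery proportion) for one simulated data set.

    truly_de  -- gene sets simulated as differentially expressed
    called_de -- gene sets a test method declared significant
    """
    truly_de, called_de = set(truly_de), set(called_de)
    if not truly_de:
        raise ValueError("power is undefined without truly DE gene sets")
    true_pos = len(truly_de & called_de)
    power = true_pos / len(truly_de)
    # With no discoveries, the false discovery proportion is taken as 0.
    fdp = (len(called_de) - true_pos) / len(called_de) if called_de else 0.0
    return power, fdp


def summarize(results):
    """Average power and FDP over data sets; the mean FDP estimates the FDR."""
    powers, fdps = zip(*(power_and_fdp(t, c) for t, c in results))
    return mean(powers), mean(fdps)


# Hypothetical parameter pairing: gene set size and effect size.
grid = list(product([10, 50, 100], [0.5, 1.0]))

# Two toy data sets for one parameter combination: (truth, calls).
results = [
    ({"A", "B"}, {"A", "C"}),  # one true positive, one false positive
    ({"A", "B"}, {"A", "B"}),  # both true sets found, no false positives
]
est_power, est_fdr = summarize(results)
```

In a full study, `summarize` would be called once per entry in `grid`, with each entry backed by 50 or 100 simulated data sets, so that power and FDR can be compared across parameter values.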