Abstract:
|
Set-based inference on genetic markers has become an increasingly popular alternative to studying markers individually. A significant challenge, however, is that the associations between an outcome and the markers in a set are expected to be sparse and weak, and existing methods may lack the power to detect such effects. Motivated by the Berk-Jones (BJ) statistic, which is notable for its strong asymptotic properties in detecting rare-weak signals, we propose a new test for association between a SNP-set and an outcome: the Generalized Berk-Jones (GBJ) statistic. The GBJ statistic modifies the standard BJ statistic to explicitly account for correlation between markers in a set, thus greatly increasing the power of the test when applied to correlated SNPs. We also provide a computationally efficient analytical p-value calculation for our method. The advantages of GBJ are demonstrated through rejection region analysis, simulation, and application to gene-level and pathway-level analyses of breast cancer data.
|