Abstract:
|
To correct for a large number of hypothesis tests, most researchers rely on simple multiple testing corrections. Yet new selective inference methodologies could improve power by letting researchers explore test statistics alongside covariates that supply informative weights, while retaining the desired statistical guarantees. We explore the use of adaptive p-value thresholding (AdaPT; Lei & Fithian 2018) in the context of genome-wide association studies (GWAS), with particular emphasis on schizophrenia (SCZ). We use flexible gradient boosted trees to account for covariates constructed from independent GWAS statistics for genetically correlated bipolar disorder, the effect size of SNP genotypes on gene expression, and gene-gene coexpression captured by subnetwork membership. We demonstrate a substantial increase in power to detect SCZ associations using gene expression information from the developing human prefrontal cortex (Werling et al. 2019). Importantly, our entire process for identifying enrichment and creating features from independent, complementary data sources can be applied in many different high-throughput settings to ultimately improve power.
|