Abstract:
|
Rare variants pose a central challenge in the analysis of whole genome sequencing data. Traditional single-SNP tests have poor power for rare variants, so methods that test for association by aggregating the test statistics of multiple rare variants in a genetic region have become popular. These aggregation methods, such as SKAT, have good power when signals are dense in the set of SNPs tested but can have poor power when signals are sparse. In contrast, thresholding methods for signal detection, such as the higher criticism and Berk-Jones methods, have good power in the presence of sparse signals. However, they rely on the single-SNP test statistics being asymptotically normally distributed. For a binary phenotype, this normality assumption does not hold in the presence of rare variants, and it yields incorrect type I error rates. We propose a rare variant higher criticism approach to sparse signal detection that has higher power than existing aggregation methods while maintaining the correct size.
|
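For orientation, the classical higher criticism statistic mentioned above can be sketched as follows. This is a minimal illustration of the standard normal-theory form, computed from per-SNP p-values, and not the rare variant approach proposed here. The function names, the `alpha0` search fraction, and the use of two-sided normal p-values are assumptions made for this example; the normal-theory p-values are exactly the ingredient the abstract says becomes unreliable for rare variants with a binary phenotype.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a z-score under a standard normal null (assumed)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def higher_criticism(pvalues, alpha0=0.5):
    """Classical higher criticism statistic over the smallest alpha0 fraction of p-values.

    HC = max_i sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))),
    where p_(1) <= ... <= p_(n) are the sorted p-values.
    Returns -inf if no p-value in the search range lies strictly inside (0, 1).
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one p-value")
    p_sorted = sorted(pvalues)
    k = max(1, int(alpha0 * n))  # number of smallest p-values to scan
    best = -math.inf
    for i, p in enumerate(p_sorted[:k], start=1):
        if not 0.0 < p < 1.0:
            continue  # skip degenerate p-values to avoid division by zero
        hc_i = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, hc_i)
    return best


# Hypothetical inputs: one strong signal among otherwise unremarkable p-values.
sparse = [1e-6] + [0.5] * 9
null_like = [0.5] * 10
hc_sparse = higher_criticism(sparse)
hc_null = higher_criticism(null_like)
```

A single very small p-value is enough to drive the statistic up, which is why thresholding statistics of this kind are suited to sparse signals, whereas a sum over all SNPs dilutes one strong signal among many nulls.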