Abstract:
|
This paper considers testing procedures for screening large genome-wide data sets, in which we examine the association of hundreds of thousands of genetic variants, e.g., single nucleotide polymorphisms (SNPs), with a quantitative phenotype. We screen the whole genome by SNP sets and propose a new test statistic based on the joint effect of multiple SNPs. The test incorporates correlations between variables and is defined as the maximum over a set of multiple SNPs. We derive the limiting null distribution of the test statistic and the power of the test. Under certain conditions, the test is shown to be more powerful than the minimum p-value method, which is the most commonly used approach in genome-wide screening. The proposed test is compared with existing methods, including the Higher Criticism (HC) test and the sequence kernel association test (SKAT), through simulations and the analysis of a real genome data set. For typical genome-wide data, where the effects of individual SNPs are weak, the proposed test clearly outperforms these competing methods.
|