Abstract:
|
Population stratification, the systematic difference in allele frequencies between subpopulations, is a major confounder in genome-wide association studies (GWAS) and leads to inflated type I error rates. Efficient methods exist to adjust for population stratification in GWAS of common variants. However, those methods may fail for rare variants, which tend to display local spatial structure. Here, we propose a procedure to control for population stratification in the case of rare variants. We first applied the Patient Rule-Induction Method (PRIM) to identify clustered subjects with extreme phenotypes. Those subjects were deemed outliers and removed for further investigation. GWAS was then carried out using a linear mixed model that controlled for principal components of the genome-wide data. Simulation studies showed that the proposed procedure controlled the type I error rate well and yielded higher power than existing methods. We applied the procedure to a GWAS of lipid levels in the Dallas Heart Study to demonstrate its efficiency.
|
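The outlier-screening step can be illustrated with a minimal sketch of PRIM-style top-down peeling. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name `prim_peel`, the two-dimensional `coords` (standing in for whatever space the subjects cluster in), the simulated phenotype, and the parameters `alpha` and `min_support` are all hypothetical. The subsequent mixed-model association step is not shown.

```python
import numpy as np

def prim_peel(coords, pheno, alpha=0.1, min_support=0.05):
    """Simplified PRIM peeling (illustrative only).

    Starting from a box containing all subjects, repeatedly peel a
    fraction `alpha` off one face of the box, choosing the peel that
    leaves the highest mean phenotype. Stop when any further peel would
    leave fewer than `min_support * n` subjects, and return the box with
    the highest mean seen along the way, as a boolean mask.
    """
    n, d = coords.shape
    inside = np.ones(n, dtype=bool)
    best_box, best_mean = inside.copy(), pheno.mean()
    while True:
        step_mean, step_box = -np.inf, None
        for j in range(d):
            # Peel thresholds are quantiles of the subjects still in the box.
            lo, hi = np.quantile(coords[inside, j], [alpha, 1 - alpha])
            for keep in (coords[:, j] > lo, coords[:, j] < hi):
                cand = inside & keep
                size = cand.sum()
                if size < min_support * n or size == inside.sum():
                    continue  # too small, or the peel removed nothing
                m = pheno[cand].mean()
                if m > step_mean:
                    step_mean, step_box = m, cand
        if step_box is None:
            return best_box
        inside = step_box
        if step_mean > best_mean:
            best_mean, best_box = step_mean, inside.copy()

# Hypothetical simulated data: one spatial cluster with an elevated phenotype.
rng = np.random.default_rng(0)
n = 500
coords = rng.uniform(0.0, 1.0, size=(n, 2))
pheno = rng.normal(0.0, 1.0, size=n)
cluster = (coords[:, 0] < 0.3) & (coords[:, 1] < 0.3)
pheno[cluster] += 8.0

box = prim_peel(coords, pheno)   # subjects flagged as clustered outliers
kept = ~box                      # subjects retained for the association analysis
```

In this sketch, the subjects in `box` would be set aside and only those in `kept` passed on to the association analysis. Full PRIM also includes a pasting step and typically selects the final box by cross-validation, both omitted here.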