Abstract:
|
Advances in next-generation sequencing technologies have led to the establishment of large-scale whole genome sequencing (WGS) studies. Before conducting genome-wide association analyses on these studies, researchers step through a series of quality control procedures to remove low-quality variants. One of these steps is assessing the Hardy-Weinberg Equilibrium (HWE) assumption, where deviations from HWE may suggest genotyping problems, inbreeding, or population stratification. While the chi-square goodness-of-fit test and the exact test of genotypic proportions have been commonly used to assess HWE, these tests are inappropriate for large-scale WGS studies, which contain population structure and relatedness. Currently, no methods account for both when testing for HWE. We propose an HWE test based on generalized estimating equations that accounts for population structure through principal components and for relatedness among samples through a genetic relationship matrix. We show via simulations and a data application with population structure and relatedness that our method appropriately controls the type-I error rate and has high statistical power.
|