Abstract:
|
Whole-genome sequencing (WGS)-based association analysis of complex traits remains a tremendous challenge because of the large number of rare variants (RVs), many of which are neutral and not associated with the trait. External biological knowledge, such as functional annotations based on ENCODE, may help distinguish causal RVs from neutral ones. However, each functional annotation captures only one aspect of biological function. Our ability to select informative annotations a priori is limited, while incorporating non-informative annotations introduces noise and reduces power. We propose a versatile test that incorporates multiple biological annotations and adapts at both the annotation and variant levels, thus maintaining high power even in the presence of non-informative annotations. In addition to extensive simulations, we illustrate the proposed test using the UK10K WGS data. We identified and replicated genome-wide significant genetic loci associated with LDL, which were missed by existing RV association tests that either ignore external biological information or rely on a single source of biological knowledge.
|