Abstract:
|
Publicly available genetic summary data is used in research and clinical settings to identify potential causal variants, build polygenic scores, and leverage common controls. However, the utility of publicly available data is limited for understudied and admixed populations because genetic summary data can mask population structure. The Summix method was developed to accurately and precisely estimate and adjust for ancestry in genetic summary data at the continental ancestry level. Here, we assess the ability of Summix to accurately and precisely estimate fine-scale ancestry in genetic summary data from gnomAD v3.1.2 genome continental ancestries (e.g., East Asian, African) using up to 52 fine-scale reference groups. We find that Summix estimates fine-scale ancestry with high accuracy and precision given sufficiently large reference sample sizes. For example, we estimate fine-scale East Asian ancestries with reference sample sizes N > 100 to within 0.3% accuracy and precision. The ability of Summix to estimate fine-scale ancestry will increase the utility of publicly available genetic summary data for individuals and studies that do not exactly match the fine-scale ancestries currently available.
|