All Times EDT

Abstract Details

Activity Number: 43 - Statistical Genetics II – New Models for Complex Study Designs
Type: Contributed
Date/Time: Monday, August 3, 2020 : 10:00 AM to 2:00 PM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #313750
Title: A SVM-Based Ancestry Inference Method for Large-Scale Genetic Data
Author(s): Zhennan Zhu* and Ani Manichaikul and Maria Murach and Catherine Robertson and Suna Onengut-Gumuscu and Stephen Rich and Wei-Min Chen
Companies: University of Virginia and University of Virginia and University of Virginia and University of Virginia and University of Virginia and University of Virginia and University of Virginia
Keywords: Ancestry Inference; KING; SVM; Machine Learning
Abstract:

Ancestry inference is routinely performed in genetic studies for quality control and association analyses. We present a support-vector-machine (SVM)-based method that identifies the most likely ancestral group(s) for an individual by leveraging known ancestry in a reference dataset (e.g., the 1000 Genomes Project data). The method first projects each study sample onto the principal component (PC) space of the reference dataset, then trains an SVM classifier on the reference samples and uses it to classify the ancestry of each study sample. The algorithm has been integrated into the computationally efficient tool KING, and the implementation scales to large datasets containing over one million individuals. We assessed the performance of our algorithm using 13,181 subjects who were genotyped with the Illumina HumanCoreExome Beadarray as well as the Illumina ImmunoChip Beadarray. We also predicted ancestry for 469,660 subjects in the UK Biobank. Of 441,441 reporting white ethnicity, 99.9% were classified as European; of 10,971 reporting Asian ethnicity, 97.0% were classified as South or East Asian; and of 7,637 reporting black ethnicity, 97.2% were classified as African.
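The two-step procedure described in the abstract (project study samples onto reference PCs, then classify with an SVM trained on reference labels) can be illustrated with a small sketch. This is not KING's implementation: the data are simulated, every function name is hypothetical, the PCs are approximated by power iteration, and a linear SVM trained by Pegasos-style subgradient descent stands in for whatever SVM variant and kernel the authors use, which the abstract does not specify. It is written in plain Python so that it runs without extra packages; a real analysis would rely on a numerical library.

```python
import random

def simulate(n, freqs, rng):
    # Genotypes coded as 0/1/2 allele copies, drawn from per-SNP allele frequencies.
    return [[(rng.random() < p) + (rng.random() < p) for p in freqs] for _ in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(rows, mean):
    return [[x - mu for x, mu in zip(r, mean)] for r in rows]

def top_pcs(rows, k, rng, iters=100):
    # Approximate the top-k principal axes of centered rows by power iteration,
    # removing previously found axes at each step (deflation).
    m = len(rows[0])
    axes = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(m)]
        for _ in range(iters):
            scores = [dot(r, v) for r in rows]                                   # X v
            v = [sum(s * r[i] for s, r in zip(scores, rows)) for i in range(m)]  # X^T (X v)
            for a in axes:
                c = dot(v, a)
                v = [x - c * y for x, y in zip(v, a)]
            norm = dot(v, v) ** 0.5
            v = [x / norm for x in v]
        axes.append(v)
    return axes

def project(rows, axes):
    return [[dot(r, a) for a in axes] for r in rows]

def train_linear_svm(X, y, rng, lam=0.01, epochs=50):
    # Pegasos-style stochastic subgradient descent on the regularized hinge loss.
    # Labels are +1/-1; the last weight is an intercept (regularized for simplicity).
    w = [0.0] * (len(X[0]) + 1)
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            xi = X[i] + [1.0]
            margin = y[i] * dot(w, xi)
            w = [(1 - eta * lam) * wj for wj in w]
            if margin < 1:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w

def predict(w, x):
    return 1 if dot(w, x + [1.0]) >= 0 else -1

rng = random.Random(0)
n_snps = 20
# Two simulated populations with strongly different allele frequencies (an assumption).
freq_a = [0.1 if i % 2 == 0 else 0.9 for i in range(n_snps)]
freq_b = [1 - p for p in freq_a]

ref = simulate(40, freq_a, rng) + simulate(40, freq_b, rng)      # reference panel
ref_labels = [1] * 40 + [-1] * 40                                 # known ancestry
study = simulate(25, freq_a, rng) + simulate(25, freq_b, rng)    # samples to classify
study_truth = [1] * 25 + [-1] * 25                                # used only for checking

# Step 1: PCs come from the reference alone; study samples are centered with the
# reference means and projected onto the same axes.
mean = [sum(col) / len(ref) for col in zip(*ref)]
axes = top_pcs(center(ref, mean), k=2, rng=rng)
ref_pcs = project(center(ref, mean), axes)
study_pcs = project(center(study, mean), axes)

# Step 2: train on reference PC scores, then classify the projected study samples.
w = train_linear_svm(ref_pcs, ref_labels, rng)
predicted = [predict(w, x) for x in study_pcs]
accuracy = sum(p == t for p, t in zip(predicted, study_truth)) / len(study_truth)
print(f"agreement with simulated labels: {accuracy:.2f}")
```

The key design point mirrored here is that the study samples never influence the PC axes or the classifier; they are only projected and scored, which is what lets the reference model be reused across arbitrarily large study cohorts.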


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program