Abstract Details

Activity Number: 376
Type: Contributed
Date/Time: Tuesday, August 2, 2016 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #321078
Title: A Classification Model for Detecting De Novo Mutation in Primary Amenorrhea Probands
Author(s): Fuchen Liu* and Kathryn Roeder and Bernie Devlin and Aleksandar Rajkovic
Companies: Carnegie Mellon University and Carnegie Mellon University and University of Pittsburgh School of Medicine and University of Pittsburgh
Keywords: Primary Amenorrhea ; de novo mutation ; Autism ; classification in imbalanced data ; retraining ; TADA-denovo function to find risk genes
Abstract:

De novo mutations in probands (the first affected family member) are one of the main causes of Primary Amenorrhea and are valuable for diagnosing and treating the disease. However, when the parents' data are unavailable, exome sequencing alone cannot distinguish de novo mutations from inherited variants. To separate de novo mutations from inherited variants, we started with a fully labeled (de novo or inherited) Autism Spectrum Disorder (ASD) dataset of 2317 trios. We identified 4 features that are informative about de novo mutations and then built a classification model using methods for imbalanced datasets. We also propose a new 'retraining' (or transfer learning) method for imbalanced data so that the model fits the Primary Amenorrhea dataset better. Using this model, we found 66 possible de novo Loss of Function mutations and 230 possible de novo missense3 mutations among 7001 rare variants from 100 probands. The result agrees well with the labeled part of the Primary Amenorrhea data. Based on this result, we also obtained a risk ranking of genes using the TADA-denovo function, which was proposed by Xin He et al. in 2013.
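The abstract does not name the four features, the classifier, or the details of the retraining step. As a rough illustration only, the sketch below assumes a class-weighted logistic regression on a single synthetic feature, and treats 'retraining' as a warm start: the model fitted on a large labeled source set is used as the starting point for a short fit on a small labeled target set. All data values, function names, and hyperparameters here are hypothetical and are not taken from the authors' work.

```python
import math


def balanced_class_weights(labels):
    """Weight each class by n / (2 * n_class) so both classes carry equal total weight.

    Assumes binary 0/1 labels with both classes present.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    return {0: n / (2 * n_neg), 1: n / (2 * n_pos)}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_weighted_logreg(xs, ys, init=(0.0, 0.0), lr=0.5, epochs=500):
    """Gradient descent on the class-weighted logistic loss for one feature.

    `init` allows a warm start from previously fitted (slope, intercept).
    """
    w, b = init
    cw = balanced_class_weights(ys)
    total = sum(cw[y] for y in ys)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = cw[y] * (sigmoid(w * x + b) - y)
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / total
        b -= lr * grad_b / total
    return w, b


def predict(xs, w, b):
    return [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in xs]


# Hypothetical source data: 95 "inherited" (0) and 5 "de novo" (1) variants.
xs_src = [-2 + 0.02 * i for i in range(95)] + [1.0, 1.2, 1.4, 1.6, 1.8]
ys_src = [0] * 95 + [1] * 5

weights = balanced_class_weights(ys_src)
w_src, b_src = fit_weighted_logreg(xs_src, ys_src)
preds_src = predict(xs_src, w_src, b_src)

# Hypothetical small labeled target set with a shifted feature distribution.
xs_tgt = [-0.5, 0.0, 0.3, 0.6, 0.9, 1.1, 2.2, 2.6]
ys_tgt = [0, 0, 0, 0, 0, 0, 1, 1]

# "Retraining" sketched as a warm start from the source parameters.
w_tgt, b_tgt = fit_weighted_logreg(xs_tgt, ys_tgt, init=(w_src, b_src), epochs=200)

print("class weights:", weights)
print("source model:", (w_src, b_src))
print("retrained model:", (w_tgt, b_tgt))
print("target predictions:", predict(xs_tgt, w_tgt, b_tgt))
```

Weighting by inverse class frequency is only one of several common ways to handle imbalance (resampling and threshold adjustment are others), and the abstract does not say which the authors used.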


Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

Copyright © American Statistical Association