All Times EDT

Abstract Details

Activity Number: 268 - Replicability and the Narrative of Scientific Research
Type: Invited
Date/Time: Wednesday, August 11, 2021 : 1:30 PM to 3:20 PM
Sponsor: American Association for the Advancement of Science
Abstract #316860
Title: Replicability and Missing Data in Deep Learning and Clinical Prediction
Author(s): Naim Rashid* and David Lim and Joseph G Ibrahim
Companies: University of North Carolina at Chapel Hill, Dept of Biostatistics and University of North Carolina at Chapel Hill, Dept of Biostatistics and UNC
Keywords: Replicability; Deep Learning; Missing Data; RNA-seq; EHR data; Clinical Prediction
Abstract:

The replicability of statistical algorithms for clinical decision-making has been of significant concern in biomedical and translational research, where multiple factors may limit the generalizability of models trained on individual studies. In the first part of this talk, we describe recent work in high-dimensional data integration and meta-learning with respect to both supervised learning (clinical prediction) and unsupervised learning (cluster discovery). Applications to cancer subtype discovery and prediction will be discussed. In the second part of this talk, we will discuss the issue of missing data in deep learning neural networks and its impact on model generalizability. We introduce new methodology for principled handling of missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR) patterns of missingness in feedforward neural networks to improve the performance of regression and classification tasks in the presence of missing data. We show that our methodology avoids manual selection of features to model the missingness mechanism and can flexibly handle multiple patterns of missingness across features in high-dimensional data. We demonstrate the performance of our approach on simulated and real EHR datasets.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program