All Times EDT

Abstract Details

Activity Number: 320 - Electronic Health Records, Causal Inference and Miscellaneous
Type: Contributed
Date/Time: Wednesday, August 11, 2021 : 3:30 PM to 5:20 PM
Sponsor: Section on Statistics in Epidemiology
Abstract #318721
Title: A Novel Semiparametric Approach to Analyzing Enriched Electronic Health Record Data
Author(s): Jill Schnall* and Yizheng Wei and Yanyuan Ma and Ravi Parikh and Jinbo Chen
Companies: University of Pennsylvania and University of South Carolina and Penn State University and University of Pennsylvania and University of Pennsylvania
Keywords: Electronic Health Records; Two Phase Design; Missing Data; Risk Prediction; Predictive Accuracy

When using electronic health records (EHRs) for clinical and translational studies, additional data is often collected from external sources to enrich the information extractable from EHRs. For example, it is common to combine biobank or patient survey data with data extracted from the EHR. Because the external data is generally available for only a small subset of EHR patients, the integrated data follows a monotone missingness structure. We propose a novel semiparametric method for developing and assessing models for predicting the risk of binary outcomes using data that follows this missingness pattern. Building upon the existing literature on two-phase study designs, our method allows for the efficient utilization of the complete EHR data when incorporating the incomplete external data. We propose new estimators for the area under the ROC curve as well as other measures for quantifying predictive accuracy in this setting. Lastly, we apply our method to an EHR dataset incorporating additional patient survey data to develop a preliminary model for predicting the risk of mortality for oncology patients in the University of Pennsylvania hospital system.
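For readers unfamiliar with the setting, the sketch below illustrates the monotone missingness pattern described above and a standard inverse-probability-weighted (IPW) AUC from the two-phase design literature. It is not the semiparametric estimator proposed in this abstract, which is not specified here. All names (`simulate_two_phase`, `ipw_auc`), the simulated covariates, the risk score, and the sampling probabilities are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_two_phase(n=2000, seed=0):
    """Simulate monotone missingness (illustrative assumption, not real EHR data).

    Every patient has the outcome y and an EHR covariate x_ehr (phase one).
    The external covariate x_ext is observed only for a sampled subset (phase two).
    """
    rng = np.random.default_rng(seed)
    x_ehr = rng.normal(size=n)
    x_ext = rng.normal(size=n)
    logit = -1.0 + x_ehr + x_ext
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
    # Assumed known phase-two sampling probabilities that depend on the outcome.
    pi = np.where(y == 1, 0.5, 0.2)
    selected = rng.random(n) < pi
    # External data is missing (NaN) for everyone not selected into phase two.
    x_ext_obs = np.where(selected, x_ext, np.nan)
    return y, x_ehr, x_ext_obs, selected, pi


def ipw_auc(y, score, weights):
    """Weighted AUC: weighted share of case/control pairs in which the case
    has the higher score; tied pairs count one half."""
    y = np.asarray(y)
    score = np.asarray(score, dtype=float)
    w = np.asarray(weights, dtype=float)
    cases, controls = (y == 1), (y == 0)
    diff = score[cases][:, None] - score[controls][None, :]
    concordance = (diff > 0) + 0.5 * (diff == 0)
    pair_w = w[cases][:, None] * w[controls][None, :]
    return float((pair_w * concordance).sum() / pair_w.sum())


if __name__ == "__main__":
    y, x_ehr, x_ext_obs, selected, pi = simulate_two_phase()
    # Hypothetical risk score using both covariates; only computable in phase two.
    score = x_ehr[selected] + x_ext_obs[selected]
    auc = ipw_auc(y[selected], score, 1.0 / pi[selected])
    print(f"phase-two sample size: {selected.sum()}, IPW AUC: {auc:.3f}")
```

The IPW estimator uses only phase-two patients and reweights them by the inverse of their selection probability. The abstract's stated aim is to improve on this kind of approach by also using the complete EHR data efficiently.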

Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program