Activity Number: 320 - Electronic Health Records, Causal Inference and Miscellaneous
Type: Contributed
Date/Time: Wednesday, August 11, 2021 : 3:30 PM to 5:20 PM
Sponsor: Section on Statistics in Epidemiology
Abstract #318721
Title: A Novel Semiparametric Approach to Analyzing Enriched Electronic Health Record Data
Author(s): Jill Schnall* and Yizheng Wei and Yanyuan Ma and Ravi Parikh and Jinbo Chen
Companies: University of Pennsylvania and University of South Carolina and Penn State University and University of Pennsylvania and University of Pennsylvania
Keywords:
Electronic Health Records;
Two Phase Design;
Missing Data;
Risk Prediction;
Predictive Accuracy
Abstract:
When using electronic health records (EHRs) for clinical and translational studies, additional data is often collected from external sources to enrich the information extractable from EHRs. For example, it is common to combine biobank or patient survey data with data extracted from the EHR. Because the external data is generally available for only a small subset of EHR patients, the integrated data follows a monotone missingness structure. We propose a novel semiparametric method for developing and assessing models for predicting the risk of binary outcomes using data that follows this missingness pattern. Building upon the existing literature on two-phase study designs, our method allows for the efficient utilization of complete EHR data when incorporating the incomplete external data. We propose new estimators for the area under the ROC curve as well as other measures for quantifying predictive accuracy in this setting. Lastly, we apply our method to an EHR dataset incorporating additional patient survey data to develop a preliminary model for predicting the risk of mortality for oncology patients in the University of Pennsylvania hospital system.
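For readers unfamiliar with the two-phase setting, the sketch below shows one simple way to estimate the area under the ROC curve when the risk score can only be computed for the enriched subset. It is not the semiparametric estimator proposed in this abstract, whose details are not given here. It is a generic inverse-probability-weighted AUC, and it assumes that each patient's probability of being selected into the enriched subset is known. The function name `ipw_auc` and its arguments are hypothetical and chosen only for illustration.

```python
def ipw_auc(scores, outcomes, weights):
    """Weighted AUC over phase-two patients (illustrative sketch only).

    scores   : predicted risks for patients with external data available
    outcomes : binary outcomes (1 = event, 0 = no event)
    weights  : assumed inverse selection probabilities, 1 / P(selected)

    Returns the weighted fraction of (case, control) pairs in which the
    case has the higher score; tied pairs count as one half.
    """
    cases = [(s, w) for s, y, w in zip(scores, outcomes, weights) if y == 1]
    controls = [(s, w) for s, y, w in zip(scores, outcomes, weights) if y == 0]

    numerator = 0.0
    denominator = 0.0
    for case_score, case_weight in cases:
        for control_score, control_weight in controls:
            pair_weight = case_weight * control_weight
            denominator += pair_weight
            if case_score > control_score:
                numerator += pair_weight
            elif case_score == control_score:
                numerator += 0.5 * pair_weight

    if denominator == 0:
        raise ValueError("need at least one case and one control with positive weight")
    return numerator / denominator


# Hypothetical toy data: two cases and one control from the enriched subset.
# The second case was sampled with probability 1/3, so it carries weight 3.
toy_auc = ipw_auc(scores=[0.8, 0.2, 0.5], outcomes=[1, 1, 0], weights=[1.0, 3.0, 2.0])
print(toy_auc)
```

With all weights equal, this reduces to the usual empirical AUC on the enriched subset. A weighted estimator of this kind uses the complete EHR data only through the selection probabilities, which is the sort of inefficiency the two-phase literature aims to reduce.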
Authors who are presenting talks have a * after their name.