All Times EDT

Abstract Details

Activity Number: 155 - Contributed Poster Presentations: Mental Health Statistics Section
Type: Contributed
Date/Time: Monday, August 8, 2022 : 10:30 AM to 12:20 PM
Sponsor: Mental Health Statistics Section
Abstract #323257
Title: Mining for Equitable, Intelligent Health: Simulating Missing Data in Electronic Health Records
Author(s): Emily Getzen* and Qi Long
Companies: University of Pennsylvania and University of Pennsylvania
Keywords: Electronic Health Records; NLP; Missing Data; Knowledge Graph

Electronic health records (EHRs) are collected as a routine part of healthcare delivery and have great potential to improve patient health outcomes. In their complex form, they do not come as a neat data matrix with well-defined features. Instead, the data can take the form of sequences of medical codes ordered by time stamp. We introduce a means to simulate missing data in EHRs of this form: specifically, mechanisms to simulate data that are Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR). We account for potentially causal relationships between medical events by using a medical knowledge graph to cluster related events. We also assess how missing data affect disease prediction models for various marginalized groups. We find that using the knowledge graph to simulate missing data has a significant impact on disease prediction models, illustrating the need to account for potentially causal relationships when simulating missing data in complex EHRs.
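
To make the three missingness mechanisms concrete, here is a minimal sketch in Python. It is not the authors' method, which the abstract does not specify. It assumes a record is a time-ordered list of (day, code) events, and it stands in for the knowledge graph with a hand-written code-to-cluster mapping. The record, the codes, the cluster labels, and all function names are hypothetical.

```python
import random

# Hypothetical toy record: a time-ordered sequence of (day, code) events.
# The codes and the cluster mapping are illustrative assumptions only.
RECORD = [(0, "E11"), (3, "I10"), (7, "E11.9"), (12, "F32"), (20, "I10")]
CLUSTER = {
    "E11": "diabetes",
    "E11.9": "diabetes",
    "I10": "hypertension",
    "F32": "depression",
}


def drop_mcar(record, p, rng):
    """MCAR: every event is dropped with the same probability p."""
    return [ev for ev in record if rng.random() >= p]


def drop_mar(record, group, p_by_group, rng):
    """MAR: the drop probability depends on a fully observed covariate."""
    p = p_by_group[group]
    return [ev for ev in record if rng.random() >= p]


def drop_mnar(record, p_by_code, rng, default=0.0):
    """MNAR: the drop probability depends on the code that goes missing."""
    return [ev for ev in record
            if rng.random() >= p_by_code.get(ev[1], default)]


def drop_clusters(record, cluster_of, p, rng):
    """Cluster-level dropping: related codes go missing together."""
    clusters = sorted({cluster_of[code] for _, code in record})
    dropped = {c for c in clusters if rng.random() < p}
    return [ev for ev in record if cluster_of[ev[1]] not in dropped]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    print(drop_mcar(RECORD, 0.3, rng))
    print(drop_mar(RECORD, "group_a", {"group_a": 0.1, "group_b": 0.5}, rng))
    print(drop_mnar(RECORD, {"F32": 0.8}, rng))
    print(drop_clusters(RECORD, CLUSTER, 0.5, rng))
```

In this sketch, dropping whole clusters rather than individual events is one simple way to reflect the idea that related medical events tend to be missing together. A real knowledge graph would supply the clusters instead of the hard-coded dictionary.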

Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program