Abstract:
|
While electronic health records (EHR) are a rich and promising data source for observational research, they are highly susceptible to missingness because of the complex process by which EHR are collected and generated. Even worse, data in EHR is frequently missing not at random (MNAR); e.g., whether a given laboratory test is ordered is often correlated with the expected value of the test. This pattern is extremely common in EHR data, as patients are seen and measured more often when they are sick than when they are healthy. Building on a novel framework for handling missing data in EHR based on a modularization of the data provenance (i.e., the process by which data is observed in EHR), we present a method for localizing sensitivity to MNAR data to specific decisions or actions made by patients, their healthcare providers, or the larger healthcare system along the sequence of events defined by the data provenance. We conclude with strategies for interpreting the results of sensitivity analyses for MNAR data.
|