Abstract:
|
Electronic health record (EHR) data provide unique opportunities for comparative effectiveness research (CER) because they contain rich information on large patient populations. Selection due to incomplete data is an underappreciated source of bias in analyses of EHR data. When selection is framed as a missing-data problem, standard methods are often applied to control for selection bias. In EHR-based studies, however, data provenance involves the interplay of many clinical decisions made by patients, health care providers, and the health system; thus standard methods fail to capture the complexity of the data. In this paper, we use a novel framework for selection bias in EHR-based CER that allows for a hierarchy of missingness mechanisms to inform an inverse-probability weighted estimator that better aligns with the complex nature of EHR data. We show that this estimator is consistent and asymptotically normal. A key insight from extensive simulations is the bias-variance trade-off that arises in using this framework when the data provenance is functionally misspecified. We use this approach to adjust for selection in an ongoing, multi-site EHR-based study of the effect of bariatric surgery on BMI.
|