Abstract:
|
The analysis of large administrative datasets for public health research involves many statistical and computational challenges. Much of the current statistical research focuses on handling confounding in these data, while little work has addressed missing data and the resulting selection bias. For cohort studies, one possible approach is to match on time and other covariates in order to achieve exchangeability and estimate causal treatment effects such as the average treatment effect among the treated (ATT). However, electronic health record (EHR) data can have very high levels of missingness in both covariates and outcomes, which leads to challenging design and analysis choices if one wishes to match. We discuss approaches for targeting the ATT when exposed subjects are matched to unexposed controls on a subset of covariates that are themselves subject to missingness. We evaluate and compare these strategies via simulation, and illustrate them in an EHR-based matched cohort study of the effect of bariatric surgery on weight outcomes.
|