Abstract:
|
While electronic health records (EHR) offer researchers rich data on large populations over long periods of time, the potential for selection bias is high when analyses are restricted to patients with complete data. Approaching selection bias as a missing data problem, one could apply standard missing data methods. However, these methods generally fail to address the complexity and heterogeneity of EHR data, particularly the interplay of numerous decisions by patients, physicians, and health systems that collectively determine whether complete data are observed. Viewing each such decision as a distinct missingness sub-mechanism, we develop a flexible and scalable framework for estimation and inference in EHR-based research settings that blends inverse probability weighting and multiple imputation. With this framework, one can better align both the missingness assumptions and the analysis with the complexity of the EHR data. The proposed framework is illustrated using data from PROMISE, an EHR-based study of long-term diabetes outcomes following bariatric surgery.
|