Abstract:
|
Missing data is a prevalent issue in observational studies that use electronic health record (EHR) and registry data. It presents unique challenges for statistical inference, especially causal inference. Inappropriate handling of missing data in causal inference can reduce efficiency through a loss of sample size and can bias causal effect estimates. Beyond missing data, observational data typically comprise a mixture of heterogeneous clusters that a standard parametric model cannot adequately capture. To address these problems, we introduce a Bayesian nonparametric causal model to estimate causal effects with missing data. The proposed approach can simultaneously impute missing values, account for multiple outcomes, and estimate causal effects under the potential outcomes framework. In simulation studies comparing our method with existing causal inference approaches, our method produces the most accurate causal effect estimates. We apply the method in two case studies that evaluate the comparative effectiveness of treatments for chronic disease management in juvenile idiopathic arthritis and cystic fibrosis.
|