Abstract:
|
National databases are rich sources for identifying predictors of diseases and clinical outcomes. However, these databases often lack strict guidelines for reporting patient demographics and outcomes, which leads to missing data. In this paper, we compare several methods for controlling bias due to missing data in terms of their ability to identify predictors when the lasso is used for model selection. Traditionally, multiple imputation has been difficult to combine with shrinkage estimators because the selected model can vary across imputations. Recent advances, namely stacked and grouped multiple imputation, address this problem. Here, we compare these newer methods to complete-case analysis and a proposed inverse probability weighting approach. Through extensive simulation, we demonstrate that multiple imputation can be overly sensitive when the outcome of interest, rather than the potential exposures, is missing, although this behavior may be influenced by the method used to select the tuning parameter. This is evident with both continuous and binary exposure data. We apply these methods to data from the National Trauma Data Bank to identify predictors of open tibia fracture outcomes.
|