Abstract:
|
When at least one predictor variable has a missing value, statistical packages for regression modeling default to discarding the affected records, potentially introducing bias. Multiple imputation (MI) procedures attempt to preserve all records in the dataset by filling in the missing values with plausible values derived from the observed values. After the missing values for a predictor have been imputed, the dataset is analyzed using standard methods as if all values had been observed. By default, MI estimates these plausible values from the full range of the variable's observed data. When the variables with missing values are distinctly heterogeneous across the levels of a factor, e.g., sex, imputing over the full range while ignoring the factor could yield different results than imputing within the factor-specific range of the predictor variable. We compared Cox regression model results for cardiovascular disease obtained with within-strata MI vs. full-range MI (i.e., ignoring sex) when the variable with missing values was heterogeneous between men and women, under a fixed proportion of right censoring.
|