Abstract:
|
Predictive mean matching (PMM) is a commonly used method for imputing missing data. It combines parametric modelling with hot-deck imputation, which makes it more robust to model misspecification than purely parametric methods. PMM imputes missing values using a random hot-deck procedure with a specified distance metric, where the distances are computed from the predictions of a linear regression model fitted to the data. Although the imputations themselves do not follow the linear model prediction, the method's success still hinges largely on two things: some degree of linearity in the relationship between the variables, and the linear regression model accurately capturing the ranking of the observed units used to form the donor pool. The Penalized Spline of Propensity Prediction (PSPP) method is a semi-parametric imputation method that has been shown to provide doubly robust estimates of the mean in a variety of settings. However, it assumes that the outcome variable is Normally distributed conditional on the propensity score and the other covariates, which can lead to inaccurate results when the data are non-continuous or non-Normal. We propose a method that combines PSPP with PMM to impute univariate missing data.
|
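As a rough illustration of the standard PMM step described in the abstract (not the proposed PSPP-based method), the following minimal Python sketch fits a linear regression on the observed cases, measures distance as the absolute difference between predicted means, and draws each imputation at random from the nearest donors. The function name `pmm_impute`, the donor-pool size `k`, and the use of plain least squares without a posterior draw of the coefficients are assumptions made for this sketch only.

```python
import numpy as np


def pmm_impute(X, y, k=5, rng=None):
    """Simplified predictive mean matching for one incomplete variable.

    Hypothetical helper: missing entries of y are marked with NaN.
    Assumption: coefficients come from ordinary least squares; full PMM
    implementations usually also draw them from their posterior.
    """
    rng = np.random.default_rng(rng)
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]

    observed = ~np.isnan(y)
    design = np.column_stack([np.ones(len(y)), X])  # add an intercept column

    # Fit the linear model on the observed cases only.
    beta, *_ = np.linalg.lstsq(design[observed], y[observed], rcond=None)
    predicted = design @ beta

    donor_pred = predicted[observed]
    donor_y = y[observed]
    k = min(k, donor_y.size)

    imputed = y.copy()
    for i in np.flatnonzero(~observed):
        # Distance metric: absolute difference in predicted means.
        dist = np.abs(donor_pred - predicted[i])
        pool = np.argpartition(dist, k - 1)[:k]  # indices of the k nearest donors
        # Random hot deck: impute an observed value from one donor in the pool.
        imputed[i] = donor_y[rng.choice(pool)]
    return imputed
```

Because every imputed value is copied from an observed donor, the imputations stay within the observed support of the outcome even when the linear model fits poorly, which is the robustness property the abstract refers to.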