Abstract:
|
For imputing missing values in multinormal data with monotone missing patterns, parametric regression is the method of choice. For similar non-multinormal data, nonparametric methods based on propensity scoring have been recommended. For multinormal data with arbitrary patterns of missingness, Markov chain Monte Carlo (MCMC) imputation is available. For the general data types and missing data patterns typical of survey data, hotdeck imputation is often used. However, hotdeck-based procedures cannot account for relationships among many variables.
The multiple additive regression trees (MART) method was recently developed for constructing general models of categorical or continuous dependent variables from a wide range of data types. MART handles missing values automatically in a statistically principled way. Since MART-based regression models can be constructed from variables of any type, regardless of the pattern of missing data, MART-based modeling offers a natural extension to regression modeling. Here we compare the results of MART-based imputation with those of hotdeck-based imputation for simulated and actual survey data sets.
|