Abstract:
|
Surveys often collect detailed breakdowns of total items. Detail proportions can vary greatly by sample unit, and the multinomial distributions can likewise vary by imputation cell. Consequently, although it may be feasible to develop viable parametric imputation models for the total, doing so for the collective set of detail items is challenging. Instead, a common practice is to use some form of hot deck imputation to match donor and recipient records, then impute the donor's complete set of proportions. Nearest neighbor imputation is useful when the set of proportions is correlated with unit size. This approach preserves the correlation between the detail items within an imputation cell, as long as the number of donors is greater than or equal to the number of recipients. Unfortunately, this condition often does not hold in practice, and collapsing imputation cells is not an attractive alternative. Using a limited simulation study based on historic data from selected industries in the 2012 Economic Census, we compare unrestricted use of the donor records in the original cell with use of a random draw from the donor record's multinomial distribution.
|
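The two donor-based options named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation. It assumes that donors are matched on the absolute difference in the total item, that each record is a dictionary with hypothetical keys `total` and `details`, and that the multinomial draw has a user-chosen size `n_draws`. None of these choices is specified in the abstract.

```python
import random
from collections import Counter


def nearest_donor(recipient_total, donors):
    """Pick the donor whose total is closest to the recipient's total.

    Assumption: unit size is measured by the total item, and distance is
    the absolute difference. No limit is placed on donor reuse.
    """
    return min(donors, key=lambda d: abs(d["total"] - recipient_total))


def donor_proportions(donor):
    """Convert a donor's reported detail items into proportions."""
    detail_sum = sum(donor["details"])
    return [value / detail_sum for value in donor["details"]]


def impute_direct(recipient_total, donor):
    """Allocate the recipient's total with the donor's exact proportions."""
    return [recipient_total * p for p in donor_proportions(donor)]


def impute_multinomial(recipient_total, donor, n_draws, rng):
    """Allocate the recipient's total with proportions from one multinomial
    draw whose cell probabilities are the donor's proportions.

    Assumption: n_draws (the multinomial sample size) is a tuning choice.
    """
    p = donor_proportions(donor)
    # n_draws independent categorical draws, tallied by cell, form one
    # multinomial sample of size n_draws.
    counts = Counter(rng.choices(range(len(p)), weights=p, k=n_draws))
    return [recipient_total * counts[i] / n_draws for i in range(len(p))]


# Hypothetical data: two donors in one imputation cell, one recipient.
donors = [
    {"total": 100.0, "details": [50.0, 30.0, 20.0]},
    {"total": 1000.0, "details": [100.0, 400.0, 500.0]},
]
recipient_total = 120.0
donor = nearest_donor(recipient_total, donors)
direct = impute_direct(recipient_total, donor)
drawn = impute_multinomial(recipient_total, donor, n_draws=200,
                           rng=random.Random(2012))
```

In this sketch both functions return detail values that sum to the recipient's total. The direct version gives every recipient matched to the same donor an identical set of proportions, whereas the multinomial version varies them from recipient to recipient, which is the contrast the simulation study examines when donors must be reused.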