Abstract:
|
Practitioners in government statistics need quantitative criteria for hot deck imputation that indicate the acceptable donor pool size and proportion of missing values. We show that the accuracy of hot deck imputation may be affected less by donor pool size than by a large proportion of missing values. The input data for the simulation are random numbers drawn from four probability density functions and three real data sets. For each sample size, values are set to missing at random in proportions from 10% to 90% and then filled in by random hot deck imputation, and the effects of donor pool size and proportion of missing values are investigated. The results show that sample size affects only the level of the RMSE, mean, and SD. Regardless of sample size, once the proportion of missing values exceeds 30-40%, the RMSE becomes larger than the SD of the complete data. Although the mean and SD become more dispersed as the proportion of missing values increases, they do not deviate significantly as long as that proportion is within 50%. These results suggest that hot deck imputation is appropriate as long as the proportion of missing values is no more than 30-40%, regardless of donor pool size.
|
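The simulation design summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `random_hot_deck` and `simulate`, the standard normal input distribution, the default sample size, and the choice to compute the RMSE over all cells (imputed and observed) are assumptions made only for illustration.

```python
import math
import random
import statistics


def random_hot_deck(values, rng):
    """Replace each None with a donor drawn at random, with replacement,
    from the observed (non-missing) values."""
    donors = [v for v in values if v is not None]
    if not donors:
        raise ValueError("donor pool is empty")
    return [v if v is not None else rng.choice(donors) for v in values]


def simulate(n=1000, missing_rate=0.3, seed=0):
    """One simulation run: generate data, delete values completely at random,
    impute by random hot deck, and summarize the result."""
    rng = random.Random(seed)

    # Assumption: standard normal input; the abstract does not name
    # the four probability density functions it uses.
    complete = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Mark a fixed share of cells as missing, chosen uniformly at random.
    n_missing = round(n * missing_rate)
    missing_idx = set(rng.sample(range(n), n_missing))
    observed = [None if i in missing_idx else x for i, x in enumerate(complete)]

    imputed = random_hot_deck(observed, rng)

    # Assumption: RMSE is taken over all n cells; observed cells contribute 0.
    squared_error = sum((a - b) ** 2 for a, b in zip(imputed, complete))
    return {
        "rmse": math.sqrt(squared_error / n),
        "sd_complete": statistics.stdev(complete),
        "mean_imputed": statistics.mean(imputed),
        "sd_imputed": statistics.stdev(imputed),
    }


if __name__ == "__main__":
    # Sweep the proportion of missing values from 10% to 90%.
    for pct in range(10, 100, 10):
        result = simulate(n=1000, missing_rate=pct / 100, seed=pct)
        print(pct, {k: round(v, 3) for k, v in result.items()})
```

Comparing `rmse` with `sd_complete` across the sweep gives the kind of threshold check the abstract describes. The exact crossover point depends on how the RMSE is defined and on the input distribution, so this sketch should not be expected to reproduce the reported 30-40% figure.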