Abstract:
|
The DISCRETE edit system (Chen, Winkler, 2000) is based on the Fellegi-Holt (1976) algorithm and identifies records that fail edits, together with solutions that change a minimum number of fields. We propose to impute the items involved in failed edits using solutions that are compatible with Fellegi-Holt in the sense that they also change a minimum number of fields. We derive the imputations from a multivariate correlation model for mixed discrete and continuous variables (Olkin, Tate, 1960). When only discrete items need to be imputed, the item imputation is equivalent to a discriminant analysis. When only continuous items need to be imputed, it is equivalent to a multivariate regression. The method can also impute joint combinations of discrete and continuous items. We will compare and evaluate our method against a donor-based imputation system. We will focus on the differences that arise when few or no donors are available for donor-based imputation, and when the edit rules disagree with the discriminant analysis. We make recommendations on the use of each imputation method in a range of situations.
|