Online Program

Multivariate selective editing based on the use of administrative data: an application to the Italian business survey on ICT usage and e-commerce
*Orietta Luzi, Italian National Statistical Institute - Istat 

Keywords: errors, nonresponse burden, survey costs, data integration

This paper presents the results of an experimental application of the SeleMix (Selective Editing via Mixture models) tool to the Italian annual business survey on ICT usage and e-commerce. SeleMix is an R package developed at the Italian National Statistical Institute. It implements a multivariate selective editing approach for continuous data that uses contamination models to detect influential errors, that is, errors with a potentially large impact on the target estimates, on which editing effort should be concentrated. The approach defines a score function that is directly related to the expected error in the data. Unlike most selective editing methods, the threshold that identifies the subset of influential units therefore has a statistical interpretation and can be linked to the accuracy of the estimates. Predictions of the data are also provided for each unit. Because the efficiency of the approach also depends on the reliability of the auxiliary information used in the models, information on the ICT population that is available from administrative sources and related to the ICT target phenomena is exploited in model estimation. Results show a significant reduction in the number of recontacts needed to achieve predefined accuracy levels for the estimates.
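The scoring and selection logic described above can be illustrated with a minimal sketch. It is not the SeleMix API and does not reproduce the application in the paper: it assumes a single variable instead of the multivariate case, a Gaussian contamination model whose parameters (mu, var, pi, lam) are taken as already known rather than estimated with administrative covariates, and invented data. All function and parameter names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def expected_errors(y, mu, var, pi, lam):
    """Expected error of each observation under an assumed contamination model.

    Assumption: true values are N(mu, var); with probability pi an observation
    carries an additive error, so a contaminated value is N(mu, lam * var), lam > 1.
    """
    errors = []
    for yi in y:
        clean = (1 - pi) * normal_pdf(yi, mu, var)
        contaminated = pi * normal_pdf(yi, mu, lam * var)
        tau = contaminated / (clean + contaminated)  # posterior prob. of error
        # Given contamination, E[true | observed] = mu + (yi - mu) / lam,
        # so the expected error is tau * (yi - mu) * (1 - 1 / lam).
        errors.append(tau * (yi - mu) * (1 - 1 / lam))
    return errors


def select_influential(y, weights, mu, var, pi, lam, threshold):
    """Flag units for recontact until the residual score is below the threshold."""
    errors = expected_errors(y, mu, var, pi, lam)
    predictions = [yi - e for yi, e in zip(y, errors)]
    total = sum(w * p for w, p in zip(weights, predictions))
    # Score: weighted expected error relative to the estimated total.
    scores = [w * abs(e) / abs(total) for w, e in zip(weights, errors)]
    order = sorted(range(len(y)), key=lambda i: scores[i], reverse=True)
    residual = sum(scores)
    selected = []
    for i in order:
        if residual <= threshold:
            break
        selected.append(i)
        residual -= scores[i]
    return selected, scores


# Invented data: one value is far from the bulk of the distribution.
y = [10.0, 11.0, 9.5, 10.5, 60.0, 10.2]
weights = [1.0] * len(y)
selected, scores = select_influential(
    y, weights, mu=10.0, var=1.0, pi=0.05, lam=100.0, threshold=0.01
)
print(selected)
```

In this sketch the threshold bounds the summed expected relative error left in the units that are not recontacted, which is the sense in which it can be read as an accuracy level for the estimated total.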