Online Program
Towards Unrestricted Public Use Business Microdata: Construction of The Synthetic Longitudinal Business Database

John Abowd, Cornell University
Ron S Jarmin, US Census Bureau
*Satkartar K Kinney, National Institute of Statistical Sciences
Javier Miranda, US Census Bureau
Jerry P Reiter, Duke University
Arnold Reznek, US Census Bureau

Keywords: Synthetic data, longitudinal, business register, administrative data, confidentiality protection, imputation

Longitudinal business data are widely desired by researchers but are difficult to make available to the public because of confidentiality constraints. In this paper, we discuss the generation of synthetic public-use datasets for establishment data. The basic idea is to release simulated values of sensitive variables, generated from probability distributions fit to the genuine data. This can protect confidentiality because the released attributes are synthetic rather than real. When the models describe the data well, broad-scale inferences drawn from the synthetic datasets remain valid. We discuss the approaches used for generating synthetic public-use files for the U.S. Census Longitudinal Business Database.
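The basic idea described in the abstract (fit a probability distribution to genuine values, then release draws from the fitted distribution) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the models used for the Synthetic Longitudinal Business Database. The lognormal model, the function names `fit_lognormal` and `synthesize`, and the employment figures are all assumptions made up for illustration.

```python
import math
import random
import statistics


def fit_lognormal(values):
    """Estimate the mean and standard deviation of log(values).

    Assumption: the sensitive variable is strictly positive and roughly
    lognormal. This model choice is illustrative, not taken from the paper.
    """
    logs = [math.log(v) for v in values]
    return statistics.mean(logs), statistics.stdev(logs)


def synthesize(values, seed=0):
    """Return one synthetic value per genuine record.

    Each synthetic value is drawn from the fitted distribution, so no
    released value is a genuine record's value.
    """
    mu, sigma = fit_lognormal(values)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [math.exp(rng.gauss(mu, sigma)) for _ in values]


# Hypothetical establishment employment counts (made-up numbers).
genuine_employment = [12, 45, 7, 130, 22, 64, 9, 310, 18, 51]
synthetic_employment = synthesize(genuine_employment)
```

In this toy setting, aggregate quantities such as the mean of log employment should be similar in the genuine and synthetic data when the fitted model is adequate, which is the sense in which broad-scale inferences can remain valid.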