Online Program

Protecting Confidentiality While Preserving Quality of Establishment Data
*Lawrence Cox, National Institute of Statistical Sciences (NISS) 


Statistical techniques for preserving the confidentiality of establishment data date back to the 1940s and earlier. Early data releases were confined to two-way tables in printed form, and data protection amounted to suppressing sensitive ("risky") tabular cells. Complementary cell suppression (CCS), in which additional cells are suppressed to thwart reconstruction of risky cells, evolved subsequently. Early CCS methods were ad hoc and performed by hand. Toward the 1960s, first at the US Census Bureau, CCS was automated, resulting in faster but still ad hoc data protection. From the mid-1970s to the late 1990s, statistically principled methods for defining and limiting statistical disclosure were developed and implemented, first at the US Census Bureau and Statistics Canada and later elsewhere. These methods include rounding, perturbation, and complementary cell suppression, each based on a risk model for disclosure defined by one or more linear sensitivity measures. Up to this point, however, little or no consideration was given to the effects of data protection on data quality and usability. In the early 2000s, a new method, quality-preserving controlled tabular adjustment, was introduced that did incorporate quality concerns. Subsequently, the quality effects of the original three statistical disclosure limitation (SDL) methods were investigated and, to a degree, quantified.

This short course will cover the risk model for disclosure in establishment data presented in tabular form and the four SDL methods discussed above, and will examine each method's effectiveness for data protection and its effects on data quality and usability. Other SDL methods, such as releasing interval data and perturbing the underlying microdata, will also be discussed. Future opportunities for SDL, such as statistical database query systems and establishment-based microdata, will be described.

About the instructor:

Lawrence H. Cox serves as Assistant Director for Official Statistics, National Institute of Statistical Sciences, and as an Independent Statistical Consultant. He has held senior methodologist and research director positions at the US Census Bureau, Environmental Protection Agency, and National Center for Health Statistics, and has also served as Director of the Board on Mathematical Sciences, US National Academy of Sciences. Larry is a Fellow of the ASA and an Elected Member of the ISI, and has published and lectured extensively on SDL and quality-preserving SDL.