Online Program

All Times ET

Program is Subject to Change

Wednesday, June 16
Wed, Jun 16, 1:30 PM - 3:30 PM
Statistical Confidentiality and Establishment Surveys: Challenges and Solutions to Improving Data Utility

A Proposed 'Bottom-Up' Differential Privacy Approach for Disclosure Prevention in Data Query Tool (308013)

Tom Krenzke, Westat 
*Jianzhu Li, Westat 

Keywords: disclosure risk, real-time system, hypercube, perturbation error

The Occupational Requirements Survey (ORS), conducted by the Bureau of Labor Statistics (BLS) under contract to the Social Security Administration (SSA), collects data on the requirements of work at a detailed occupation level for the overall U.S. civilian economy. BLS and SSA are interested in developing a real-time query system that provides summary tables to users and researchers. Although ORS is not an establishment survey, its data carry a risk of disclosing the identities of participating establishments when an establishment has an almost unique occupation or when a single occupation accounts for a dominant share of its employees. A “bottom-up” differential privacy approach is proposed to reduce the disclosure risk associated with the tables published by the ORS query tool. A hypercube is created by cross-tabulating occupation with all work requirement variables. Noise is generated from a differentially private algorithm, adjusted by the average quote weight, and added to the records in the hypercube. The perturbed hypercube serves as the input data to the query tool, and the tables requested by users are created by aggregating the corresponding hypercube records. To reduce the variance of the aggregated noise, the hypercube can also be calibrated to control totals derived at high aggregation levels from the original ORS data (or from those data with a small amount of noise added). For variance estimation, a formula is provided that accounts for both sampling error and perturbation error.
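
The flow described in the abstract (perturb the hypercube cells, aggregate them into a requested table, then calibrate to a control total) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the noise mechanism, so Laplace noise is used as a common differentially private choice; the scale `avg_weight / epsilon`, the simple ratio calibration to a single grand total, the function names, and the cell values are all hypothetical and are not the authors' method or ORS data.

```python
import random
from collections import defaultdict


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_hypercube(cells, epsilon, avg_weight, rng):
    """Add Laplace noise to every cell of the hypercube.

    cells maps (occupation, requirement) -> weighted estimate.
    ASSUMPTION: noise scale = avg_weight / epsilon; the abstract only
    says the noise is "adjusted by average quote weight".
    """
    scale = avg_weight / epsilon
    return {key: value + laplace_noise(scale, rng) for key, value in cells.items()}


def calibrate(cells, control_total):
    """Ratio-adjust cells so they sum to a control total.

    ASSUMPTION: one grand total and a single ratio factor; it also
    assumes the noisy cells sum to a nonzero value.
    """
    factor = control_total / sum(cells.values())
    return {key: value * factor for key, value in cells.items()}


def aggregate(cells, dim):
    """Build a requested table by summing cells over one key dimension."""
    totals = defaultdict(float)
    for key, value in cells.items():
        totals[key[dim]] += value
    return dict(totals)


# Hypothetical hypercube: occupation x work requirement (made-up values).
cube = {
    ("nurse", "standing"): 1200.0,
    ("nurse", "lifting"): 800.0,
    ("clerk", "standing"): 300.0,
    ("clerk", "lifting"): 150.0,
}
control = sum(cube.values())  # control total taken from the original data

rng = random.Random(0)
noisy = perturb_hypercube(cube, epsilon=1.0, avg_weight=50.0, rng=rng)
calibrated = calibrate(noisy, control)
by_occupation = aggregate(calibrated, dim=0)  # table requested by a user
```

Because the noise is added once at the cell level, every table built from the same perturbed hypercube is consistent with every other table, which is the practical appeal of a bottom-up design. The cost is that noise accumulates as cells are summed, which is why the abstract pairs the approach with calibration at high aggregation levels.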