Abstract Details

Activity Number: 527
Type: Topic Contributed
Date/Time: Wednesday, August 3, 2016 : 10:30 AM to 12:20 PM
Sponsor: Government Statistics Section
Abstract #319189
Title: Controlling Identification Disclosure Risk in Microdata Release Through Unbiased Post-Randomization
Author(s): Cheng Zhang* and Tapan Kumar Nayak and Jiashen You
Companies: and The George Washington University and Department of Transportation
Keywords: statistical disclosure control ; identification risk measure ; unbiased post-randomization ; disclosure control goals ; data utility
Abstract:

Statistical agencies aim to release informative data to the public while avoiding disclosure of respondents' information, which requires more than removing direct identifiers. Usually, a perturbed data set is generated from the original data using statistical disclosure avoidance techniques and then released. However, measuring disclosure risk is difficult because disclosure can occur in many different forms and depends on the behavior of intruders. As a result, designing the perturbation mechanism and assessing the utility of the perturbed data are quite challenging. In this paper, we propose a novel and rigorous measure of identification disclosure risk and use it to articulate clear and realistic disclosure control goals. We then present unbiased post-randomization methods for achieving those goals. Specifically, the probability of correctly identifying any sample unit will not exceed a pre-chosen value. We also assess the utility of the perturbed data and show that the variance added by our perturbation procedure is negligible compared to the sampling variance. Finally, as an illustrative example, we apply our procedure to a public use micro sample released by the U.S. Census Bureau.
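
To make the idea of unbiased post-randomization concrete, here is a minimal sketch of one well-known construction. It is not the authors' procedure, which the abstract does not spell out. It assumes a single categorical variable and a hypothetical retention probability `theta`: each record keeps its value with probability `theta` and otherwise receives a value drawn from the empirical distribution of the original data. The implied transition matrix leaves the category frequencies unchanged in expectation, which is the sense of "unbiased" used here. The sketch does not enforce the bound on identification probability described in the abstract; the function names are illustrative only.

```python
import random
from collections import Counter


def invariant_transition_matrix(values, theta):
    """Build P[i][j] = theta * [i == j] + (1 - theta) * pi[j].

    pi is the empirical distribution of `values`; theta (an assumed
    tuning parameter) is the probability of keeping a value unchanged.
    """
    counts = Counter(values)
    cats = sorted(counts)
    n = len(values)
    pi = [counts[c] / n for c in cats]
    k = len(cats)
    P = [[theta * (i == j) + (1 - theta) * pi[j] for j in range(k)]
         for i in range(k)]
    return cats, pi, P


def expected_frequencies(pi, P):
    """Expected category frequencies after perturbation: pi^T P."""
    k = len(pi)
    return [sum(pi[i] * P[i][j] for i in range(k)) for j in range(k)]


def post_randomize(values, theta, rng):
    """Perturb each record independently according to the matrix above."""
    cats, pi, _ = invariant_transition_matrix(values, theta)
    perturbed = []
    for v in values:
        if rng.random() < theta:
            perturbed.append(v)  # keep the original value
        else:
            # redraw from the empirical distribution of the original data
            perturbed.append(rng.choices(cats, weights=pi, k=1)[0])
    return perturbed


if __name__ == "__main__":
    data = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
    cats, pi, P = invariant_transition_matrix(data, theta=0.7)
    print("original frequencies:", dict(zip(cats, pi)))
    print("expected after perturbation:", expected_frequencies(pi, P))
    print("one perturbed copy:", post_randomize(data, 0.7, random.Random(0)))
```

Because the expected frequencies equal the original ones, frequency-based estimates from the perturbed data remain unbiased; a smaller `theta` perturbs more records and therefore adds more variance.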


Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

 
 
Copyright © American Statistical Association