Abstract:
|
In traditional applications of differential privacy, all access to the underlying data must pass through a 'privacy filter' that perturbs each query answer with noise, with the noise scale carefully chosen to limit the influence of any single individual on the distribution of output statistics. However, in real-world applications of differentially private methods, it is sometimes desirable to weaken the privacy guarantee in exchange for ensuring that some output statistics are 'invariant,' or unperturbed by noise. Imposing complex sets of invariants of this kind creates two problems. First, the privacy guarantee must be modified appropriately. Second, real-world surveys have complex structure, user communities often expect microdata, and generating microdata that satisfies a complex set of invariants at scale is a non-trivial task. Using the development of the 2020 Decennial Census's Disclosure Avoidance System as an example, we discuss the second of these two problems: how to use polyhedral geometry to ensure that a valid microdata set will always be generated in the face of complex invariants at large scale, and the implications of this approach for data utility.
|