Abstract Details

Activity Number: 289 - Assessing the Quality of Integrated Data
Type: Topic Contributed
Date/Time: Tuesday, July 30, 2019 : 8:30 AM to 10:20 AM
Sponsor: Government Statistics Section
Abstract #304595
Title: Balancing Data Confidentiality and Research Needs: NCHS Linked Mortality Files
Author(s): Lisa Mirel* and Cordell Golden and Cindy Zhang
Companies: CDC/NCHS and CDC/NCHS/OAE/SPB and CDC/NCHS/OAE/SPB
Keywords: confidentiality; synthetic microdata; health surveys; longitudinal studies; mortality
Abstract:

Big data and analytics foster new knowledge, but they also challenge producers of sensitive data, who aim to assure confidentiality in publicly accessible data. Making sensitive data publicly available involves tradeoffs, such as re-identification risk. One strategy to reduce re-identification risk is to release synthetic or partially synthetic data. However, these releases could distort the true underlying data. This presentation will discuss analyses assessing the extent of that distortion. Recently, the Data Linkage Program at the National Center for Health Statistics released partially synthetic public-use linked mortality files. To create the public-use version of the restricted-use file, a re-identification risk scenario was used to determine which records were at risk for disclosure, and values for select records were then perturbed. To demonstrate the comparability of the public-use and restricted-use versions of the linked mortality files, estimated relative hazards for all-cause and cause-specific mortality were calculated. The results reveal key analytical considerations and the importance of such work in the context of data quality.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program