Abstract:
|
Many institutions and researchers require access to detailed microdata. Historically, agencies have selectively allowed access to the “real data” under strict security restrictions that respect disclosure avoidance regulations. Some programs produce public use microdata samples (PUMS), i.e., highly sanitized versions of the real data. Heavily sanitized data, however, may not be suitable for model-building and analyses. Consequently, many agencies are investigating methods of developing synthetic microdata as an alternative to PUMS, with the intent of producing datasets with high utility for analyses and low risk of disclosure. That said, measures of utility and privacy risk can be highly subjective and will differ by dataset: for example, household data are very different from data on economic populations. This roundtable will be an open discussion about synthetic data development methods and metrics, focusing on challenges, successes, and innovations, along with any specific aspects that interest the participants.
|