Abstract:
|
Statistical agencies are committed to protecting the confidentiality of survey respondents. Before statistical data products are released, a risk assessment is needed to ensure that the disclosure risk is acceptably low. Skinner and Shlomo (2008) developed a log-linear modeling approach to measure the re-identification risk in microdata. Because the same respondents participate in more than one wave of a longitudinal survey, the re-identification risk is usually higher than in cross-sectional data. Even if the microdata files for individual waves do not contain a common ID variable that identifies the same respondents across waves, common variables that do not change over time, or that change in predictable patterns, may allow users to link records across the files to form longitudinal records. In this paper, we use the Survey of Doctorate Recipients (SDR) public use files as an example to demonstrate how the log-linear modeling approach can measure the re-identification risk while incorporating the longitudinal nature of the data, and how it quantifies the increase in longitudinal risk relative to the cross-sectional risk.
|