Abstract:
|
The usual way to prevent intruders from identifying an individual's data in a table is to suppress all small cells (primary suppression), along with the secondary (complementary) cells that must also be suppressed to protect the primary suppressions. An organization typically applies a fixed suppression cutoff to all cells in a particular data product, usually between 3 and 10 or higher. While this technique is sufficient to de-identify a vulnerable cell, it can often result in over-suppression. In this paper we explore the use of scores, currently in use in some State health departments, that rate a cell for potential de-identification. The scores we illustrate are based on demographic characteristics such as gender, race, time-reporting period (e.g., weekly, monthly, quarterly), residence geography, etc. We illustrate how applying the scores to different tables can identify cells with disclosure risk, and how the very same small cell can receive different outcomes when put into the context of other available data. The rationale behind the scores is also discussed.
|