Abstract:
When Consumer Expenditure (CE) Survey data are released to the public, the law requires that sensitive and identifying information be masked to protect the confidentiality of responding households. A statistical disclosure limitation (SDL) process known as "top-coding" is used for this purpose. For instance, in the publicly released microdata, high values of household property tax are replaced by the average of all high household property tax values. Top-coding can affect the utility and quality of the microdata, especially for analyses that are sensitive to the tails of the distribution. For example, bias in the cumulative distribution function (CDF) of a top-coded expenditure can lead to inaccurate point estimates and measures of variation. In this study, we investigate the effect of top-coding on the empirical CDF of certain expenditures, in terms of the relationship between expenditures and income after adjusting for demographics. We implement a data utility measure based on the empirical CDF to assess the consequences of top-coding for the utility of the CE microdata.
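As a rough illustration of the mechanism described above, the following Python sketch top-codes a synthetic expenditure variable and compares the empirical CDFs before and after masking. Everything in it is an assumption made for illustration: the data are simulated rather than CE microdata, the threshold (the 97th percentile) is hypothetical, the function names are invented here, and the maximum absolute gap between the two empirical CDFs is only one simple CDF-based summary, not necessarily the utility measure used in the study.

```python
import numpy as np


def top_code(values, threshold):
    """Replace every value above `threshold` with the mean of those values."""
    values = np.asarray(values, dtype=float)
    high = values > threshold
    coded = values.copy()
    if high.any():
        coded[high] = values[high].mean()
    return coded


def ecdf(sample, grid):
    """Empirical CDF of `sample` evaluated at each point of `grid`."""
    sample = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(sample, grid, side="right") / sample.size


def max_ecdf_gap(original, masked):
    """Largest absolute difference between the two empirical CDFs."""
    grid = np.union1d(original, masked)
    return float(np.max(np.abs(ecdf(original, grid) - ecdf(masked, grid))))


# Synthetic, right-skewed "property tax" values (not real CE data).
rng = np.random.default_rng(0)
property_tax = rng.lognormal(mean=7.5, sigma=0.8, size=5000)

# Hypothetical top-coding threshold: the 97th percentile of the sample.
threshold = np.quantile(property_tax, 0.97)
released = top_code(property_tax, threshold)

gap = max_ecdf_gap(property_tax, released)
print(f"mean before: {property_tax.mean():.2f}, after: {released.mean():.2f}")
print(f"max ECDF gap: {gap:.4f}")
```

Because the high values are replaced by their own average, the overall mean is unchanged, while the empirical CDF is distorted only above the threshold. In this sketch the gap therefore cannot exceed the share of top-coded records, which is why tail-sensitive analyses are the ones most affected.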
ASA Meetings Department
732 North Washington Street, Alexandria, VA 22314
(703) 684-1221 • meetings@amstat.org
Copyright © American Statistical Association.