Abstract #301974


The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.



JSM 2002 Abstract #301974
Activity Number: 306
Type: Topic Contributed
Date/Time: Wednesday, August 14, 2002 : 10:30 AM to 12:20 PM
Sponsor: Section on Government Statistics*
Title: Assessing Disclosure Protection for A SOI Public Use File
Author(s): Marianne Winglee*+ and Peter Sailer and Michael Weber and Richard Valliant and Jay Clark and Yunhee Lim
Affiliation(s): Westat, Inc. and Internal Revenue Service and Internal Revenue Service and Westat, Inc. and Westat, Inc. and Westat, Inc.
Address: 1650 Research Blvd., Rockville, MD, 20850, USA
Keywords: subsampling; microdata; disclosure risk; information loss; microaggregation clusters; record linkage
Abstract:

The Statistics of Income (SOI) program of the Internal Revenue Service (IRS) is mandated to provide data to support analyses of the tax system in the United States. For this purpose, SOI releases a data file each year that contains microdata from a representative sample of individual tax returns. A public-use version of this tax model file is released after disclosure control procedures are applied to prevent recognition of individual taxpayers. The disclosure control techniques applied to the public-use file include suppression, global and local recoding, top coding, rounding, subsampling, and microaggregation (replacing reported values with the average of a sorted cluster of three values). To address recent concerns about the information explosion, SOI has sponsored several evaluation studies to determine the availability of external data and the risk of data linkage. This paper discusses methods for measuring the likelihood of correct matches for rare income returns and examines options for refining the subsampling method and the microaggregation clusters to strengthen disclosure protection while preserving the analytic value of the data.
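
The microaggregation step described in the abstract (sorting values and replacing each with the average of its cluster of three) can be illustrated with a minimal sketch. This is not SOI's actual procedure: the function name `microaggregate`, the single-variable sort, and the handling of leftover records (folded into the last cluster) are assumptions made only for illustration.

```python
from typing import List


def microaggregate(values: List[float], cluster_size: int = 3) -> List[float]:
    """Replace each value with the mean of its cluster in sorted order.

    Hypothetical illustration: the abstract states only that reported data
    are replaced by the average of a sorted cluster of three values.
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")

    # Sort record positions by value so that each cluster holds similar values.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    result = [0.0] * n

    start = 0
    while start < n:
        end = start + cluster_size
        # Assumption: if fewer than cluster_size records would remain,
        # fold them into the current cluster so no cluster is undersized.
        if n - end < cluster_size:
            end = n
        cluster = order[start:end]
        mean = sum(values[i] for i in cluster) / len(cluster)
        # Write the cluster mean back to each record's original position.
        for i in cluster:
            result[i] = mean
        start = end
    return result


incomes = [3.0, 1.0, 2.0, 10.0, 12.0, 11.0]
masked = microaggregate(incomes)
```

Because every value is replaced by its cluster mean, the overall total is unchanged while individual reported amounts are no longer visible, which is the trade-off between disclosure protection and analytic value that the abstract refers to.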


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.

For information, contact meetings@amstat.org or phone (703) 684-1221.

Revised March 2002