Abstract #302031

This is the preliminary program for the 2003 Joint Statistical Meetings in San Francisco, California. Currently included in this program is the "technical" program, schedule of invited, topic contributed, regular contributed and poster sessions; Continuing Education courses (August 2-5, 2003); and Committee and Business Meetings. This on-line program will be updated frequently to reflect the most current revisions.


The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.





JSM 2003 Abstract #302031
Activity Number: 342
Type: Contributed
Date/Time: Wednesday, August 6, 2003 : 9:00 AM to 10:50 AM
Sponsor: Section on Government Statistics
Title: Statistical Sensitive Data Protection and Inference Prevention with Decision Tree Methods
Author(s): Liwu Chang*+
Companies: Naval Research Laboratory
Address: Mail Code 5540, Washington, DC 20375-0001
Keywords: data confidentiality; statistical inference; information hiding; data disclosure; decision tree; ramification
Abstract:

We present a new approach for protecting sensitive data in a relational table (columns: attributes; rows: records). The inference problem arises when unauthorized users can infer sensitive data from nonsensitive data. We treat inference as correct classification and approach it with decision tree methods. As in our previous work, sensitive data are viewed as the classes of the test data, and nonsensitive data are the remaining attribute values. In general, however, sensitive data may not be associated with a single attribute (i.e., the class) but may be distributed among many attributes. We present a generalized decision tree (GDT) method for distributed sensitive data. GDT takes each attribute in turn as the class and analyzes the corresponding classification error. Attribute values that maximize an integrated error measure are selected for modification. Our analysis shows that modified attribute values can be restored and hence that sensitive data are not securely protected. This result implies that the modified values must themselves be protected. We present methods for this ramified protection problem and also discuss other statistical attacks.
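
The per-attribute step described in the abstract (take each attribute in turn as the class and measure how well the other attributes predict it) might be sketched as follows. This is an illustrative sketch only, not the author's GDT implementation: the tiny ID3-style tree, the leave-one-out error estimate, the function names, and the toy table are all assumptions introduced here, and the abstract's integrated error measure is not specified and is not reproduced.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def build_tree(rows, target, features):
    """Grow a small ID3-style tree over categorical attributes (assumed learner)."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: predict the majority class

    def split_entropy(f):
        # Weighted entropy of the class after splitting on attribute f.
        groups = {}
        for r in rows:
            groups.setdefault(r[f], []).append(r[target])
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    best = min(features, key=split_entropy)
    rest = [f for f in features if f != best]
    branches = {}
    for r in rows:
        branches.setdefault(r[best], []).append(r)
    children = {v: build_tree(sub, target, rest) for v, sub in branches.items()}
    return (best, majority, children)


def predict(tree, row):
    """Follow the tree; fall back to the node majority for unseen values."""
    while isinstance(tree, tuple):
        feature, majority, children = tree
        tree = children.get(row[feature], majority)
    return tree


def attribute_errors(rows):
    """Treat each attribute in turn as the class; return its leave-one-out error."""
    attrs = list(rows[0])
    errors = {}
    for target in attrs:
        features = [a for a in attrs if a != target]
        wrong = 0
        for i, held_out in enumerate(rows):
            tree = build_tree(rows[:i] + rows[i + 1:], target, features)
            wrong += predict(tree, held_out) != held_out[target]
        errors[target] = wrong / len(rows)
    return errors


# Hypothetical toy table: "grade" is fully determined by "dept",
# while "site" is unrelated to the other two attributes.
rows = [
    {"dept": d, "grade": g, "site": s}
    for d, g in (("A", "hi"), ("B", "lo"))
    for s in ("x", "y", "x", "y")
]
errors = attribute_errors(rows)
```

In this sketch a low error for an attribute means its values can be inferred from the rest of the table, so attributes such as `grade` above are the ones an inference attack could recover. Which attribute values to modify, and how, would depend on the integrated error measure, which the abstract does not define.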


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.


JSM 2003 For information, contact meetings@amstat.org or phone (703) 684-1221. If you have questions about the Continuing Education program, please contact the Education Department.
Revised March 2003