JSM 2004 - Toronto

Abstract #301375

This is the preliminary program for the 2004 Joint Statistical Meetings in Toronto, Canada. Currently included in this program is the "technical" program, schedule of invited, topic contributed, regular contributed and poster sessions; Continuing Education courses (August 7-10, 2004); and Committee and Business Meetings. This on-line program will be updated frequently to reflect the most current revisions.


The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.





Activity Number: 195
Type: Contributed
Date/Time: Tuesday, August 10, 2004 : 8:30 AM to 10:20 AM
Sponsor: Section on Survey Research Methods
Abstract - #301375
Title: Identification and Impact of Faked Interviews in Surveys--An Analysis of Genuine Fakes by Means of Benford's Law and Robust Machine Learning Approach for Outlier Detection
Author(s): Gert G. Wagner*+ and Joerg-Peter Schraepler and Christin Schaefer and Klaus-Robert Mueller
Companies: DIW Berlin and Ruhr-University Bochum and Fraunhofer FIRST.IDA and Fraunhofer FIRST.IDA and University of Potsdam
Address: Koenigin Luise Strasse 5, Berlin, International, 12163, Germany
Keywords: Benford's Law ; faked interviews ; robust machine learning ; genuine fakes ; outlier detection ; SOEP
Abstract:

Panel data provide a unique opportunity to identify data that were actually faked by interviewers. By comparing data from two waves, unequivocal fakes are easily identified. We use the known fakes in the raw data of the German Socio-Economic Panel Study (SOEP) to analyze the potential impact of undetected fakes on survey results. Because most surveys are purely cross-sectional and have no second wave, we searched for methods that do not need two waves of data. We test (1) an unconventional benchmark called Benford's Law, which is used by numerous accountants to detect fraud, and we apply (2) a robust machine learning approach for outlier detection that uses a resampling technique. By combining both methods we can identify the majority of the interviews (0.5% of all SOEP interviews) that are unequivocally faked. However, the major result is that the faked and fraudulent records have almost no impact on the means and proportions of substantive results. Finally, one should note that, except for some fakes in the first two waves of sample E, faked data were never disseminated within the widely used SOEP.
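
For readers unfamiliar with Benford's Law, the following is a minimal sketch of a first-digit check, not the authors' procedure. The abstract does not say which variables or test statistic were used, so the function names, the chi-square comparison, and the sample values below are all assumptions made for illustration. Benford's Law states that the leading digit d (1 to 9) occurs with probability log10(1 + 1/d).

```python
import math
from collections import Counter

def benford_expected():
    """Expected first-digit probabilities under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """Return the leading nonzero digit of a number, or None for zero."""
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    return None

def benford_chi2(values):
    """Chi-square statistic comparing observed first digits to Benford's Law.

    Hypothetical helper: with 9 digit classes there are 8 degrees of
    freedom, so a statistic above about 15.507 would be unusual at the
    5% level if the values really followed Benford's Law.
    """
    digits = [d for d in (first_digit(v) for v in values) if d is not None]
    n = len(digits)
    if n == 0:
        raise ValueError("no nonzero values to test")
    counts = Counter(digits)
    stat = 0.0
    for d, p in benford_expected().items():
        expected = n * p
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat

# Invented example: monetary amounts reported in one interviewer's
# interviews (made-up numbers, not SOEP data).
reported_amounts = [900, 950, 920, 9100, 980, 9400, 990, 930]
print(benford_chi2(reported_amounts))
```

In practice such a statistic would be computed per interviewer and the largest deviations examined further. A large value alone does not prove fabrication, since small samples and variables with a restricted range can deviate from Benford's Law for innocent reasons.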


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.


JSM 2004 For information, contact jsm@amstat.org or phone (888) 231-3473. If you have questions about the Continuing Education program, please contact the Education Department.
Revised March 2004