JSM 2014 Home

Abstract Details

Activity Number: 83
Type: Contributed
Date/Time: Sunday, August 3, 2014 : 4:00 PM to 5:50 PM
Sponsor: Survey Research Methods Section
Abstract #311680
Title: Quality and Analysis of Sets of National Files
Author(s): William Winkler*+
Companies: U.S. Census Bureau
Keywords: record linkage; edit/imputation; models; algorithms; generalized software
Abstract:

The goal of various clean-up methods is to improve the quality of files so that they are suitable for economic and statistical analyses. To fill in missing data and 'correct' fields, we need generalized software that implements the Fellegi-Holt model (JASA 1976) to preserve joint distributions and ensure that records satisfy edits. To identify/correct duplicates within and across files, we need generalized software that implements the Fellegi-Sunter model (JASA 1969). The goal of the clean-up procedures is to reduce the error in files to at most 1% (not currently attainable in many situations). In this presentation, we cover methods of modeling/edit/imputation and record linkage that naturally morph into methods of adjusting statistical analyses of files for linkage error. The modeling/edit/imputation software has four algorithms, each of which may be 100 times as fast as algorithms in commercial or experimental university software. The record linkage software used in the 2010 Decennial Census matches 10^17 pairs (300 million x 300 million) in 30 hours using 40 CPUs on an SGI Linux machine. It is 50 times as fast as recent parallel software from Stanford (Kawai et al. 2006) and 500 times as fast as software used in some statistical agencies. With skilled individuals and this fast software, a group of national files can be cleaned up and used in preliminary analyses in 3-6 months.
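As a rough illustration of the Fellegi-Sunter idea mentioned in the abstract, the sketch below scores a pair of records by summing log-likelihood ratios of field agreements and then applies a two-threshold decision rule. It is not the Census Bureau software: the field names, the m- and u-probabilities, the thresholds, and the function names `match_weight` and `classify` are all hypothetical, and the sketch assumes conditional independence of fields and exact string comparison with no handling of missing values.

```python
import math

# Hypothetical per-field parameters (illustrative values, not from the abstract):
#   m = P(field agrees | the two records are a true match)
#   u = P(field agrees | the two records are not a match)
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.05),
    "birth_year": (0.85, 0.02),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum log2 likelihood ratios over fields, assuming conditional independence."""
    total = 0.0
    for field, (m, u) in params.items():
        if rec_a.get(field) == rec_b.get(field):
            total += math.log2(m / u)  # agreement raises the weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def classify(weight, upper=5.0, lower=0.0):
    """Two-threshold decision rule; both thresholds are hypothetical."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "possible link"  # would go to clerical review


if __name__ == "__main__":
    a = {"last_name": "smith", "first_name": "ann", "birth_year": 1970}
    b = {"last_name": "smith", "first_name": "anne", "birth_year": 1971}
    w = match_weight(a, b)
    print(round(w, 3), classify(w))
```

In practice the m- and u-probabilities are estimated from the data rather than fixed by hand, and the thresholds are chosen to control false-match and false-non-match rates; the fixed numbers here only make the sketch runnable.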


Authors who are presenting talks have a * after their name.



The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Copyright © American Statistical Association.