Abstract Details
Activity Number: 83
Type: Contributed
Date/Time: Sunday, August 3, 2014, 4:00 PM to 5:50 PM
Sponsor: Survey Research Methods Section
Abstract #311680
Title: Quality and Analysis of Sets of National Files
Author(s): William Winkler*+
Companies: U.S. Census Bureau
Keywords: record linkage; edit/imputation; models; algorithms; generalized software
Abstract:
The goal of various clean-up methods is to improve the quality of files to make them suitable for economic and statistical analyses. To fill in missing data and 'correct' fields, we need generalized software that implements the Fellegi-Holt model (JASA 1976) to preserve joint distributions and assure that records satisfy edits. To identify/correct duplicates within and across files, we need generalized software that implements the Fellegi-Sunter model (JASA 1969). The goal of the clean-up procedures is to reduce the error in files to at most 1% (not currently attainable in many situations). In this presentation, we cover methods of modeling/edit/imputation and record linkage that naturally morph into methods of adjusting statistical analyses of files for linkage error. The modeling/edit/imputation software has four algorithms that may each be 100 times as fast as algorithms in commercial or experimental university software. The record linkage software used in the 2010 Decennial Census matches 10^17 pairs (300 million x 300 million) in 30 hours using 40 CPUs on an SGI Linux machine. It is 50 times as fast as recent parallel software from Stanford (Kawai et al. 2006) and 500 times as fast as software used in some statistical agencies. With skilled individuals and this fast software, a group of national files can be cleaned up and used in preliminary analyses in 3-6 months.
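As a rough illustration of the Fellegi-Sunter idea mentioned in the abstract, the sketch below scores a candidate record pair by summing log likelihood ratios over comparison fields and then assigns the pair to one of three decision regions. The field names, the m- and u-probabilities, and the two thresholds are hypothetical values chosen for illustration only; they are not taken from the Census Bureau software, which the abstract does not describe at this level of detail.

```python
import math

# Hypothetical per-field probabilities (illustrative assumptions only):
#   m = P(field agrees | the two records are a true match)
#   u = P(field agrees | the two records are not a match)
M_U_PROBS = {
    "last_name": (0.95, 0.01),
    "birth_year": (0.90, 0.05),
    "zip_code": (0.85, 0.10),
}


def match_weight(agreements):
    """Sum the log2 likelihood ratios over fields.

    `agreements` maps each field name to True (the pair agrees on that
    field) or False (the pair disagrees).
    """
    total = 0.0
    for field, (m, u) in M_U_PROBS.items():
        if agreements[field]:
            total += math.log2(m / u)              # agreement weight (positive)
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


def classify(weight, lower=0.0, upper=8.0):
    """Three-way decision; `lower` and `upper` are assumed thresholds."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "possible link"  # would typically go to clerical review


if __name__ == "__main__":
    pair = {"last_name": True, "birth_year": True, "zip_code": False}
    w = match_weight(pair)
    print(f"weight = {w:.2f}, decision = {classify(w)}")
```

In practice the m- and u-probabilities are estimated from the data and the thresholds are set from target error rates; the fixed constants here only keep the sketch short.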
Authors who are presenting talks have a * after their name.
The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Copyright © American Statistical Association.