This is the program for the 2010 Joint Statistical Meetings in Vancouver, British Columbia.
Abstract Details
Activity Number: 314
Type: Contributed
Date/Time: Tuesday, August 3, 2010 : 8:30 AM to 10:20 AM
Sponsor: Section on Survey Research Methods
Abstract - #307481
Title: Expected Number of Random Duplications Within or Between Lists
Author(s): William E. Yancey*+
Companies: U.S. Census Bureau
Address: 4600 Silver Hill Road, Suitland, MD, 20746
Keywords: record linkage; deduplication; Stirling numbers
Abstract:
The U.S. Census Bureau seeks to determine duplicately listed individuals by searching across lists to identify records with the same name and birth date. The question arises of how many of these agreeing records are random agreements, that is, two different people with the same name and birth date. To formally answer this question, we consider first the familiar Birthday Problem and then the more complicated Collision Problem. For each of these problems we exhibit the explicit probability distributions from which we can compute means and variances for some parameter values. We apply this result to voter registration lists for Oregon and Washington to estimate the number of "false matches" that occur across these lists.
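As a rough illustration of the kind of quantity the abstract refers to, the sketch below computes expected counts of chance agreements under a simplifying assumption that is not stated in the abstract: every record independently takes one of d equally likely name-and-birth-date values. The function names are hypothetical, and this is the textbook Birthday Problem calculation, not necessarily the author's method.

```python
from math import comb


def expected_pairs_within(n: int, d: int) -> float:
    """Expected number of agreeing pairs among n records in one list.

    Assumes (for illustration only) d equally likely key values,
    so each of the C(n, 2) pairs agrees with probability 1/d.
    """
    return comb(n, 2) / d


def expected_pairs_between(n: int, m: int, d: int) -> float:
    """Expected number of agreeing pairs across two lists of sizes n and m.

    Same uniform assumption: each of the n * m cross pairs agrees
    with probability 1/d.
    """
    return n * m / d


def prob_no_duplicate(n: int, d: int) -> float:
    """Classic Birthday Problem: probability that n records are all distinct."""
    p = 1.0
    for i in range(n):
        p *= (d - i) / d
    return p


if __name__ == "__main__":
    # Familiar case: 23 people, 365 possible birthdays.
    print(expected_pairs_within(23, 365))
    print(prob_no_duplicate(23, 365))
```

Real name and birth date combinations are far from uniformly distributed, so counts computed this way would only be a starting point for comparison with actual lists.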
The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.