The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Abstract Details
Activity Number: 350
Type: Contributed
Date/Time: Tuesday, July 31, 2012, 10:30 AM to 12:20 PM
Sponsor: Section on Survey Research Methods
Abstract - #305140
Title: Estimating Record Linkage Error Rates Without Training Data
Author(s): William Winkler*+
Companies:
Address: 7705 Heritage Drive, Annandale, VA, 22003, United States
Keywords: statistical mixture model; latent class; capture-recapture
Abstract:
This paper provides methods for estimating false match rates and false nonmatch rates in record linkage. The estimation of false match rates adapts and extends ideas from semi-supervised learning (Larsen and Rubin 2001, Winkler 2002) using an EMH algorithm (Winkler 1990, 1993) that extends the MCECM algorithm of Meng and Rubin (1993). When unique identifiers are available, the estimates of false nonmatch rates use capture-recapture ideas from Winkler (2004). When unique identifiers are not available, the paper extends the capture-recapture methods using the ideas of subspace projections and resultant variations due to Haberman (1974). The estimation methods are verified empirically using high-quality truth decks for which the true matching status is known.
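As a rough illustration of the capture-recapture idea mentioned in the abstract, the sketch below applies the classical two-list Lincoln-Petersen estimator to the matches found by two separate matching passes. This is not the paper's method, which the abstract describes as using more general models. It assumes the two passes find true matches independently of each other, and the function names and counts are hypothetical.

```python
def estimated_total_matches(n1: int, n2: int, m: int) -> float:
    """Lincoln-Petersen estimate of the total number of true matches.

    n1: matches found by the first pass (hypothetical count)
    n2: matches found by the second pass (hypothetical count)
    m:  matches found by both passes
    Assumes the two passes capture true matches independently.
    """
    if not 0 < m <= min(n1, n2):
        raise ValueError("m must satisfy 0 < m <= min(n1, n2)")
    return n1 * n2 / m


def estimated_false_nonmatch_rate(n1: int, n2: int, m: int) -> float:
    """Estimated share of true matches missed by both passes."""
    total = estimated_total_matches(n1, n2, m)
    # Matches missed by both passes: total - (n1 + n2 - m),
    # which simplifies to (n1 - m) * (n2 - m) / m.
    missed = (n1 - m) * (n2 - m) / m
    return missed / total


if __name__ == "__main__":
    # Illustrative counts only, not taken from the paper.
    print(estimated_total_matches(900, 800, 750))
    print(estimated_false_nonmatch_rate(900, 800, 750))
```

With these illustrative counts, the two passes together find 950 distinct matches, the estimated total is 960, and the estimated false nonmatch rate is therefore about 1%. If the passes are positively dependent, this simple estimator tends to understate the number of missed matches.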
The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.
For information, contact jsm@amstat.org or phone (888) 231-3473.