JSM 2012 Online Program

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Abstract Details

Activity Number: 350
Type: Contributed
Date/Time: Tuesday, July 31, 2012, 10:30 AM to 12:20 PM
Sponsor: Section on Survey Research Methods
Abstract - #305140
Title: Estimating Record Linkage Error Rates Without Training Data
Author(s): William Winkler*+
Address: 7705 Heritage Drive, Annandale, VA, 22003, United States
Keywords: statistical mixture model ; latent class ; capture-recapture
Abstract:

This paper provides methods for estimating false match rates and false nonmatch rates in record linkage. The estimation of false match rates mimics and extends ideas from semi-supervised learning (Larsen and Rubin 2001, Winkler 2002), using an EMH algorithm (Winkler 1993, 1990) that extends the MCECM algorithm of Meng and Rubin (1993). The estimates of false nonmatch rates use capture-recapture ideas from Winkler (2004) in situations where unique identifiers are available. The paper then extends the capture-recapture methods to situations where unique identifiers are not available, using the ideas of subspace projections and resultant variations due to Haberman (1974). The estimation methods are verified empirically using high-quality truth decks for which the true matching status is known.
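
As a rough illustration of the capture-recapture idea mentioned above, the sketch below applies the textbook two-sample Lincoln-Petersen estimator to two hypothetical blocking passes. It is not the method of the paper: the function names, the example counts, and the assumption that the two passes find true matches independently are all illustrative assumptions introduced here.

```python
# Illustrative sketch only: a textbook Lincoln-Petersen estimate, assuming
# two blocking passes capture true matches independently of each other.
# Function names and example counts are hypothetical, not from the paper.

def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Estimate the total number of true matches.

    n1: true matches found by blocking pass 1
    n2: true matches found by blocking pass 2
    m:  true matches found by both passes
    """
    if m <= 0:
        raise ValueError("at least one match must be found by both passes")
    if m > min(n1, n2):
        raise ValueError("overlap cannot exceed either pass's count")
    return n1 * n2 / m


def estimated_false_nonmatch_rate(n1: int, n2: int, m: int) -> float:
    """Estimated share of true matches missed by both passes."""
    total = lincoln_petersen(n1, n2, m)
    found = n1 + n2 - m               # matches found by at least one pass
    missed = max(total - found, 0.0)  # clamp small negative estimates to zero
    return missed / total


# Hypothetical counts: 900 and 800 matches found, 750 found by both.
rate = estimated_false_nonmatch_rate(n1=900, n2=800, m=750)
print(f"estimated false nonmatch rate: {rate:.4f}")
```

With these made-up counts the estimated total is 960 matches, 950 of which are found by at least one pass, so about 1% are estimated to be missed. When the independence assumption fails, this simple estimator can be badly biased.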


The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.
