Activity Number: 285 - Probabilistic Record Linkage and Inference with Merged Data
Type: Topic Contributed
Date/Time: Tuesday, July 30, 2019 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistics in Epidemiology
Abstract #304814
Title: A Structured Prior for Sequential Bayesian Record Linkage
Author(s): Brendan McVeigh* and Jared S Murray
Companies: Carnegie Mellon University and University of Texas at Austin
Keywords: Record Linkage; Unsupervised learning; MCMC
Abstract:
Probabilistic record linkage is the problem of identifying sets of records from multiple databases that correspond to the same underlying entity in the absence of a unique identifier. For all but the smallest problems, computational considerations mean that only a small subset of the possible record pairs can be considered for matching. In principle, a multistage approach to this problem could deliver substantial gains in computational efficiency. Such an approach first considers a small number of candidate matches for each record, and considers a larger number of candidates only for records that remain unmatched after the first stage. We present a new record linkage prior and latent variable model that capture such a multistage approach. By fully incorporating the multistage approach into our statistical model, we allow for valid posterior inference despite the multistage nature of the matching.
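
The candidate-widening idea described above can be illustrated with a small Python sketch. This is a deterministic toy, not the prior or latent variable model presented in the talk: it shows only how a first stage with a narrow candidate set can be followed by a second stage with a wider candidate set for records left unmatched. The record fields (`name`, `zip`), the blocking rule, the similarity measure, the threshold, and all function names are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (illustrative choice)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(record, candidates, threshold):
    """Return the id of the most similar candidate scoring at least `threshold`, else None."""
    best_id, best_score = None, threshold
    for cand_id, cand in candidates:
        score = similarity(record["name"], cand["name"])
        if score >= best_score:
            best_id, best_score = cand_id, score
    return best_id


def multistage_link(file_a, file_b, threshold=0.85):
    """Two-stage greedy linkage of records in file_a to records in file_b.

    Stage 1 uses a narrow candidate set (hypothetical rule: same zip code
    and same first letter of the name). Stage 2 revisits only the records
    still unmatched and widens the candidates to all unused records.
    """
    links = {}
    used_b = set()

    # Stage 1: small candidate set for every record in file_a.
    for a_id, a in file_a.items():
        candidates = [
            (b_id, b)
            for b_id, b in file_b.items()
            if b_id not in used_b
            and b["zip"] == a["zip"]
            and b["name"][:1].lower() == a["name"][:1].lower()
        ]
        match = best_match(a, candidates, threshold)
        if match is not None:
            links[a_id] = match
            used_b.add(match)

    # Stage 2: larger candidate set, only for records unmatched after stage 1.
    for a_id, a in file_a.items():
        if a_id in links:
            continue
        candidates = [(b_id, b) for b_id, b in file_b.items() if b_id not in used_b]
        match = best_match(a, candidates, threshold)
        if match is not None:
            links[a_id] = match
            used_b.add(match)

    return links
```

A greedy procedure like this commits to stage-one decisions without accounting for the fact that later stages depend on them, which is the kind of issue the abstract says the proposed model handles by building the multistage structure into the statistical model itself.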
Authors who are presenting talks have a * after their name.