
All Times EDT

Abstract Details

Activity Number: 430 - Record Linkage and Auxiliary Data Sources
Type: Contributed
Date/Time: Wednesday, August 10, 2022 : 10:30 AM to 12:20 PM
Sponsor: Survey Research Methods Section
Abstract #322950
Title: Data Analysis After Record Linkage: Accounting for Mismatch Error via Mixture Models
Author(s): Zhenbang Wang* and Guoqing Diao and Emanuel Ben-David and Brady Thomas West and Martin Slawski
Companies: George Mason University and George Washington University and United States Census Bureau and Institute for Social Research, University of Michigan-Ann Arbor and George Mason University
Keywords: record linkage; mixture models; pseudo-likelihood; EM algorithm
Abstract:

Data sets obtained from linking multiple files are frequently affected by mismatch error, as a result of non-unique or noisy identifiers used during record linkage. Accounting for such mismatch error in downstream analysis performed on the linked file is critical to ensure valid statistical inference. In this talk, we present a generic framework to enable valid post-linkage inference in the challenging secondary analysis setting in which only the linked file is given. The proposed framework can flexibly incorporate additional information about the underlying record linkage process, and covers a wide selection of statistical models. Specifically, we propose a pseudo-likelihood approach that is based on two-component mixture models whose two components represent specific distributions conditional on a pair of records being a correct match or mismatch, respectively. The computational and statistical properties of the proposed approach will be studied both theoretically and empirically via simulations and in record linkage applications.
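
As a rough illustration of the two-component mixture idea described above, the following Python sketch fits a linear regression to simulated linked data by EM. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the Gaussian linear model for correctly matched pairs, the mismatch component taken to be the marginal distribution of the response (estimated once from all responses and then held fixed, in the spirit of a pseudo-likelihood), a single constant mismatch rate `alpha`, the hypothetical function names `normal_pdf` and `fit_mixture_em`, and the simulated data.

```python
import numpy as np

def normal_pdf(y, mean, sd):
    """Density of N(mean, sd^2) evaluated at y."""
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def fit_mixture_em(X, y, n_iter=200):
    """EM for a two-component mixture (illustrative assumptions only):
    correct match: y | x ~ N(x' beta, sigma^2)
    mismatch:      y ~ N(mu, tau^2), independent of x
    """
    # Mismatch component: plug-in estimate of the marginal of y, held fixed.
    mu, tau = y.mean(), y.std()
    # Initialise from the naive least-squares fit on the linked file.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    sigma = np.std(y - X @ beta)
    alpha = 0.5  # initial guess for the mismatch rate
    for _ in range(n_iter):
        # E-step: posterior probability that each linked pair is a correct match.
        f_match = (1.0 - alpha) * normal_pdf(y, X @ beta, sigma)
        f_mismatch = alpha * normal_pdf(y, mu, tau)
        w = f_match / (f_match + f_mismatch)
        # M-step: weighted least squares, weighted residual scale, mismatch rate.
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        sigma = np.sqrt(np.sum(w * (y - X @ beta) ** 2) / np.sum(w))
        alpha = 1.0 - w.mean()
    return beta, sigma, alpha, w

# Simulated linked file: true intercept 1, slope 2, about 20% of responses shuffled.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + 0.5 * rng.normal(size=n)
mismatched = rng.choice(n, size=int(0.2 * n), replace=False)
y[mismatched] = y[rng.permutation(mismatched)]  # simulate mismatch error

X = np.column_stack([np.ones(n), x])
naive_beta = np.linalg.lstsq(X, y, rcond=None)[0]
beta_hat, sigma_hat, alpha_hat, match_prob = fit_mixture_em(X, y)

print("naive slope:  ", naive_beta[1])
print("mixture slope:", beta_hat[1])
print("estimated mismatch rate:", alpha_hat)
```

In this simulated setting the naive least-squares slope is pulled toward zero by the shuffled responses, while the mixture fit downweights pairs that look like mismatches. The per-pair weights `match_prob` are estimated posterior match probabilities under the assumed model, not ground truth.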


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program