
All Times EDT

Abstract Details

Activity Number: 326 - Recent Developments in Probabilistic Record Linkage, Multiple Systems Estimation, and Entity Resolution
Type: Invited
Date/Time: Thursday, August 12, 2021 : 10:00 AM to 11:50 AM
Sponsor: Survey Research Methods Section
Abstract #314486
Title: High-Dimensional, Robust, Unsupervised Record Linkage
Author(s): Ansu Chatterjee* and Sabyasachi Bera
Companies: University of Minnesota and University of Minnesota
Keywords: Record Linkage; High Dimensional; Unsupervised; Robust
Abstract:

We develop a technique for record linkage on high-dimensional data, where the two datasets may share no common variables and no training set may be available. Our methodology is based on sparse, high-dimensional principal components. Since large, high-dimensional datasets are often prone to outliers and aberrant observations, we propose a technique for estimating robust, high-dimensional principal components. We present theoretical results that validate the robust, high-dimensional principal component estimation steps and justify their use for record linkage. Some numerical results and remarks are also presented.
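
As a rough illustration of what "robust, sparse principal components" can mean in practice, the sketch below combines two standard ingredients: a spatial sign covariance matrix (median-centered rows scaled to unit length, so a gross outlier carries no more weight than any other row) and a truncated power iteration that keeps only the k largest loadings. This is not the authors' estimator, which the abstract does not specify. The function names, the synthetic data, and the choice k = 2 are all assumptions made for illustration, and the linkage step itself is not shown.

```python
import math
import random
from statistics import median


def spatial_sign_covariance(rows):
    """Outlier-resistant scatter matrix: centre each column at its median,
    scale every row to unit length, then average the outer products."""
    n, p = len(rows), len(rows[0])
    centre = [median(r[j] for r in rows) for j in range(p)]
    scatter = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - centre[j] for j in range(p)]
        norm = math.sqrt(sum(x * x for x in d))
        if norm == 0.0:
            continue  # a row sitting exactly on the centre has no direction
        u = [x / norm for x in d]
        for a in range(p):
            for b in range(p):
                scatter[a][b] += u[a] * u[b] / n
    return scatter


def sparse_leading_component(scatter, k, iters=100):
    """Truncated power iteration: multiply by the scatter matrix, keep the
    k largest-magnitude loadings, zero the rest, renormalise."""
    p = len(scatter)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(scatter[a][b] * v[b] for b in range(p)) for a in range(p)]
        keep = set(sorted(range(p), key=lambda j: abs(w[j]), reverse=True)[:k])
        w = [w[j] if j in keep else 0.0 for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical data: variables 0 and 1 share one latent factor, the other
# four are pure noise, and five rows carry a gross outlier in variable 5.
rng = random.Random(0)
rows = []
for _ in range(200):
    z = rng.gauss(0, 1)
    signal = [3 * z + rng.gauss(0, 0.3), 3 * z + rng.gauss(0, 0.3)]
    rows.append(signal + [rng.gauss(0, 0.3) for _ in range(4)])
for i in range(5):
    rows[i][5] = 1000.0

v = sparse_leading_component(spatial_sign_covariance(rows), k=2)
support = [j for j, x in enumerate(v) if x != 0.0]
print("nonzero loadings on variables:", support)
```

With an ordinary sample covariance, the five contaminated rows would dominate the variance of variable 5 and pull the leading component toward it. Scaling each row to unit length caps that influence, which is the design choice this sketch is meant to show.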


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program