All Times EDT

Abstract Details

Activity Number: 287 - Contributed Poster Presentations: Government Statistics Section
Type: Contributed
Date/Time: Tuesday, August 9, 2022 : 10:30 AM to 12:20 PM
Sponsor: Government Statistics Section
Abstract #322129
Title: Linking Records and Imputing Missing Values for Multi-Source Data
Author(s): Mary Munro* and Hongxun Qin and Dr. Timothy Champney and Angus Chen and Sam Cohen and Yueh Quach and Dr. Yolande Tra
Companies: The MITRE Corporation (all authors)
Keywords: Record link; Missing value; Imputation; Geodistance; Statistical matching; Fuzzy matching
Abstract:

Missing values have long been a challenging problem. Variables with missing values may not be independent of each other, and imputing them independently would distort the integrity of the data. We will explore sequential imputation methods and estimate their computational cost for large datasets. We will also test filling missing values for variables in one dataset by using variables from other datasets. Research in most disciplines now requires joining datasets from multiple sources. Even data from the same source may span many years, and data elements may vary and lack proper identifiers for direct matching. Many techniques for probabilistic data linking are available, but they require more resources for large datasets. For instance, matching addresses by estimating text match probabilities is computationally intensive for large data. In this research, we combine numerical methods using latitude and longitude with parsed address parts to demonstrate and test alternative inexact matching approaches to impute missing values from secondary data sources.
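
As a rough illustration of the kind of inexact matching described above, the following sketch pairs records by comparing one parsed address part and then ranking candidates by great-circle distance computed from latitude and longitude. It is a minimal sketch under stated assumptions, not the authors' method: the field names (house_number, lat, lon), the 0.1 km threshold, and the function names are all hypothetical, and the haversine formula is one common choice of geodistance.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def best_match(record, candidates, max_km=0.1):
    """Return the nearest candidate within max_km whose house number agrees, else None."""
    best, best_dist = None, max_km
    for cand in candidates:
        # Cheap exact check on a parsed address part before computing any distance.
        # (Hypothetical rule: require identical house numbers.)
        if record["house_number"] != cand["house_number"]:
            continue
        dist = haversine_km(record["lat"], record["lon"], cand["lat"], cand["lon"])
        if dist <= best_dist:
            best, best_dist = cand, dist
    return best


def impute_from_secondary(primary, secondary, field, max_km=0.1):
    """Fill missing (None) values of `field` in primary from matched secondary records.

    Returns new dictionaries; the input records are left unchanged.
    """
    filled = []
    for rec in primary:
        rec = dict(rec)
        if rec.get(field) is None:
            match = best_match(rec, secondary, max_km)
            if match is not None and match.get(field) is not None:
                rec[field] = match[field]
        filled.append(rec)
    return filled
```

Comparing a numeric distance and a short parsed token avoids scoring full address strings against each other, which is the cost the abstract attributes to text-based probabilistic matching. For large datasets the inner loop over all candidates would itself need a spatial index or blocking step, which this sketch omits.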

Approved for Public Release; Distribution Unlimited. Public Release Case Number 22-1164


Authors who are presenting talks have a * after their name.
