Online Program

Friday, February 16
CS04 Working with Messy Data Fri, Feb 16, 9:15 AM - 10:45 AM
Salon E

Doing Data Linkage: A Behind-the-Scenes Look (303529)

Lisa B. Mirel, National Center for Health Statistics, CDC 
*Clinton J. Thompson, National Center for Health Statistics, CDC 

Keywords: mortality linkage, NCHS, NDI

Linking survey data with vital or administrative records is complicated. The steps outlined in the record linkage literature include cleaning and standardizing the input data sources, determining match weights and score cutoffs, and finalizing a file. The practice of data linkage, however, is messy and nonlinear, and it includes tasks not prominently featured in the literature: managing large databases (without Big Data tools), comparing analysis results between current and past linkages, and communicating summary data to a diverse team. Using the NCHS surveys linked to the National Death Index (NDI) as an example, this talk will discuss these practical aspects of data linkage as one practitioner speaking to another. We will explore the need to use multiple statistical programs and the use of visual dashboards for concisely presenting results to a team, and we will discuss how all of these aspects combine in the process of creating a final linked file. We will conclude with possible enhancements moving forward.
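
As an illustration of the "match weights and score cutoffs" step named in the abstract, the following is a minimal sketch of log-ratio agreement weights in the style commonly used in probabilistic record linkage. It is not the NCHS-NDI linkage method, which the abstract does not specify. The field names, the m- and u-probabilities, the cutoff values, and the function names are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not from the talk):
#   m = P(field agrees | the two records belong to the same person)
#   u = P(field agrees | the two records belong to different people)
FIELD_PROBS = {
    "ssn": (0.95, 0.001),
    "last_name": (0.90, 0.01),
    "birth_year": (0.98, 0.02),
}


def field_weight(m, u, agrees):
    """Log2 likelihood-ratio weight for one field comparison."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def pair_score(agreement):
    """Sum the field weights for one candidate record pair.

    `agreement` maps each field name to True (agrees) or False (disagrees).
    """
    return sum(
        field_weight(*FIELD_PROBS[field], agrees)
        for field, agrees in agreement.items()
    )


def classify(score, lower, upper):
    """Apply two score cutoffs (assumed values supplied by the caller)."""
    if score >= upper:
        return "link"
    if score <= lower:
        return "non-link"
    return "clerical review"


# Example: a pair that agrees on SSN and birth year but not on last name.
score = pair_score({"ssn": True, "last_name": False, "birth_year": True})
decision = classify(score, lower=0.0, upper=10.0)
```

In practice the probabilities and cutoffs would be estimated from the data and reviewed against earlier linkages rather than fixed by hand as they are here.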