Friday, February 16
CS04 Working with Messy Data
Fri, Feb 16, 9:15 AM - 10:45 AM
Doing Data Linkage: A Behind-the-Scenes Look (303529)
*Clinton J. Thompson, National Center for Health Statistics, CDC
Keywords: mortality linkage, NCHS, NDI
Linking survey and vital or administrative records is complicated. The steps outlined in the record linkage literature include cleaning and standardizing the input data sources, determining match weights and score cutoffs, and finalizing a file. The practice of data linkage, however, is messy and non-linear, and it includes tasks not prominently featured in the literature: managing large databases (without Big Data tools), comparing analysis results between current and past linkages, and communicating summary data to a diverse team. Using the NCHS surveys linked to the National Death Index (NDI) as an example, this talk will discuss these practical aspects of data linkage from the perspective of one practitioner to another. We will explore the need to use multiple statistical programs and the use of visual dashboards for concisely presenting results to a team, and we will discuss how all of these aspects combine in the process of creating a final linked file. We will conclude with possible future enhancements.
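For readers unfamiliar with the literature steps named above (standardizing inputs, determining match weights, and applying score cutoffs), the following is a minimal sketch assuming a Fellegi-Sunter style probabilistic approach. The abstract does not say which method the NCHS-NDI linkage uses, so the field names, the m- and u-probabilities, and the cutoffs here are all hypothetical values chosen only for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative only, not NCHS values):
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
M_U = {
    "last_name": (0.95, 0.01),
    "birth_year": (0.98, 0.02),
    "sex": (0.99, 0.50),
}


def standardize(value):
    """Trim whitespace and upper-case so formatting differences do not block agreement."""
    return str(value).strip().upper()


def match_score(rec_a, rec_b, m_u=M_U):
    """Sum log2 agreement or disagreement weights across the compared fields."""
    score = 0.0
    for field, (m, u) in m_u.items():
        agree = standardize(rec_a[field]) == standardize(rec_b[field])
        if agree:
            score += math.log2(m / u)              # agreement weight (positive)
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return score


def classify(score, lower=0.0, upper=8.0):
    """Apply two hypothetical cutoffs; scores between them go to clerical review."""
    if score >= upper:
        return "match"
    if score <= lower:
        return "non-match"
    return "review"


# Hypothetical survey and death records for the same person, formatted differently.
survey = {"last_name": " smith ", "birth_year": 1950, "sex": "m"}
death = {"last_name": "SMITH", "birth_year": "1950", "sex": "M"}
result = classify(match_score(survey, death))
```

In this sketch, `result` is `"match"` because all three fields agree once standardized. A pair that agrees on name and sex but not birth year lands between the two cutoffs and is routed to review, which is where much of the messy, non-linear work described in the abstract tends to arise.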