Abstract:
|
If there is x% error in each individual source file and x% error in the linkage across files, then there can be as much as 3x% error in the analysis of data from two files that are linked using non-unique common identifiers such as name, address, and date of birth. This paper describes methods for cleaning (improving the quality of) individual files, methods for increasing the accuracy of the linkages, and a model for adjusting statistical analyses for linkage error.
|