Abstract:
|
Background: Both probabilistic and deterministic data linkage have been used to link cancer registry data with health administrative claims data. However, these approaches have rarely been compared. We examined the characteristics of both approaches and calculated sensitivity and specificity for various scenarios. Methods: Many-to-Many and One-to-Many linkages were performed, and a manual review was conducted. A final set of true matches was identified and then used to calculate the related statistics for both the probabilistic and the deterministic approach. Simulations were conducted to examine how missing SSN would affect the results. Results: Little difference was found in true matches between Many-to-Many matching and One-to-Many matching. Deterministic matching provided similar results. As the percentage of missing SSN increased, the quality of linkage decreased for both approaches, especially for the deterministic approach. Discussion: Since it is challenging to acquire permission to conduct the manual review process, the deterministic approach or a combination of both approaches without manual review may be most useful.
|