Abstract:
|
The NIH requires timely sharing of scientific data resulting from all NIH-funded or NIH-conducted research. Preparing data for sharing is an iterative process of assessing and mitigating identification risks. For the shared database to be user-friendly and used effectively, detailed documentation of the deidentification process is crucial. We will discuss a successfully implemented strategy that streamlined this process by (a) developing an efficient method for maintaining documentation, and (b) creating reusable code (SAS macros) for flagging potentially identifiable data and for performing and documenting deidentification. The documentation produced by the reusable code includes a comprehensive data dictionary for users, with deidentification details at the variable level, and a comparative statistical summary of the original and public data. These SAS macros substantially reduced the labor required to deidentify and document data for sharing. Furthermore, having a streamlined process and reusable code is an asset when requesting NIH funding.
|