Abstract:
|
Missing data are ubiquitous and pose analytical challenges in distributed health data networks that leverage electronic health records (EHRs) from multiple institutions or sites, e.g., pSCANNER and PEDSnet, which are partner networks in PCORnet. Existing methods for handling missing data require pooling patient-level data into a centralized repository, and hence sharing such data across institutions. This approach, however, may not be appropriate or practical because of institutional policies (e.g., Veterans Health Administration policies require EHRs to be analyzed at VA facilities), the cost of moving large volumes of data, and, most importantly, privacy concerns. In this talk, I will first describe the problem of missing data in distributed health data networks and then present our work on privacy-preserving statistical methods, such as multiple imputation, that handle missing data in these networks without pooling patient-level data into a centralized repository. The proposed methods are evaluated in simulation studies and data examples.
|