Name: 2020 Joint Statistical Meetings
Start: 2020-08-02T07:00:00+00:00
End: 2020-08-06

All Times EDT

Abstract Details

Activity Number:	500 - Statistical Learning
Type:	Contributed
Date/Time:	Thursday, August 6, 2020 : 10:00 AM to 2:00 PM
Sponsor:	Section on Statistical Learning and Data Science
Abstract #313318
Title:	Privacy-Preserving Distributed Learning from Electronic Health Records Across Multiple Heterogenous Clinical Sites
Author(s):	Jiayi Tong* and Chongliang Luo and Rui Duan and Mackenzie Edmondson and Christopher Forrest and Yong Chen
Companies:	University of Pennsylvania and University of Pennsylvania and University of Pennsylvania and University of Pennsylvania and Children's Hospital of Philadelphia and University of Pennsylvania
Keywords:	distributed computing; heterogeneity; multi-site analysis; pairwise conditioning; pseudolikelihood function
Abstract:	Integrating electronic health records (EHRs) from multiple centers gives researchers a larger sample for better estimation and prediction. Because sharing patient-level information is difficult, data integration has motivated the development of distributed algorithms, which require sharing only aggregated information. However, most existing distributed algorithms assume that data across clinical sites are homogeneous. This assumption ignores heterogeneity in patients’ characteristics, environments, and data collection processes. In this paper, we propose a communication-efficient distributed algorithm. We use a pairwise conditioning approach to construct a pseudolikelihood function that accounts for heterogeneous distributions by allowing a site-specific unknown nuisance baseline probability function. We evaluate our algorithm through a systematic simulation study motivated by real-world scenarios and apply it to multiple datasets from the Children’s Hospital of Philadelphia (CHOP). The results show that the proposed method leads to a sensible data sharing scheme for EHRs across different clinical sites.
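
Illustrative sketch (not the authors' algorithm): the abstract gives no model or update rule, so the following Python example only shows the general idea of pairwise conditioning in the simplest setting we can assume, a logistic regression in which each site has its own unknown intercept as the nuisance term. Conditioning on a within-site pair having exactly one event makes the intercept cancel, so each site can return aggregated quantities (a gradient and a Hessian) for the shared coefficients. The function names, the simulated data, and the multi-round Newton aggregation are all assumptions for illustration; the communication-efficient scheme described in the abstract is not reproduced here.

```python
import numpy as np

def site_summary(beta, X, y):
    """Gradient and Hessian of one site's pairwise conditional log-likelihood.

    For a within-site pair (i, j) with y_i = 1 and y_j = 0, conditioning on
    y_i + y_j = 1 gives probability expit((x_i - x_j) @ beta), so the
    site-specific intercept drops out. Only these aggregates leave the site.
    """
    cases, controls = X[y == 1], X[y == 0]
    # All case-minus-control covariate differences, shape (n_pairs, p).
    diff = (cases[:, None, :] - controls[None, :, :]).reshape(-1, X.shape[1])
    eta = diff @ beta
    prob = 0.5 * (1.0 + np.tanh(eta / 2.0))  # numerically stable expit(eta)
    grad = diff.T @ (1.0 - prob)
    hess = -(diff * (prob * (1.0 - prob))[:, None]).T @ diff
    return grad, hess

def fit(sites, n_features, n_iter=25):
    """Hypothetical coordinator: sums site aggregates and takes Newton steps."""
    beta = np.zeros(n_features)
    for _ in range(n_iter):
        grads, hessians = zip(*(site_summary(beta, X, y) for X, y in sites))
        beta = beta - np.linalg.solve(sum(hessians), sum(grads))
    return beta

# Simulated sites sharing coefficients but with different intercepts (assumed values).
rng = np.random.default_rng(0)
true_beta = np.array([1.0, -0.5])
sites = []
for intercept in (-2.0, 0.0, 1.5):
    X = rng.normal(size=(200, 2))
    event_prob = 1.0 / (1.0 + np.exp(-(intercept + X @ true_beta)))
    sites.append((X, rng.binomial(1, event_prob)))

beta_hat = fit(sites, n_features=2)
print("estimated shared coefficients:", beta_hat)
```

In this sketch the site intercepts are never estimated or transmitted, which is the role the abstract assigns to the site-specific nuisance baseline function.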

Authors who are presenting talks have a * after their name.

American Statistical Association