All Times EDT

Abstract Details

Activity Number: 339 - Individual Treatment Rule and Precision Medicine
Type: Contributed
Date/Time: Wednesday, August 5, 2020 : 10:00 AM to 2:00 PM
Sponsor: Biometrics Section
Abstract #313548
Title: Distributed Learning from EHRs Across Multiple Sites for Zero-Inflated Count Outcomes, with Application to Understanding Risk Factors of Avoidable Hospitalization Admission
Author(s): Mackenzie Edmondson* and Chongliang Luo and Rui Duan and Mitchell Maltenfort and Christopher Forrest and Yong Chen
Companies: University of Pennsylvania and University of Pennsylvania and University of Pennsylvania and Children's Hospital of Philadelphia and Children's Hospital of Philadelphia and University of Pennsylvania
Keywords: distributed learning; EHR; hurdle; zero-inflation; privacy-preserving; communication-efficient
Abstract:

Electronic health record (EHR) data are widely used in modern healthcare research and contain useful information characterizing patients' clinical visits. Because of privacy concerns surrounding patient-level data sharing, most clinical data analyses are performed at individual sites. This leads to underpowered studies specific to a certain population, creating a need for methods that perform analyses across sites without sharing patient-level data. To address this need, distributed algorithms have been developed that conduct analyses across sites by sharing only aggregated information, preserving patient privacy. We propose a communication-efficient distributed algorithm for performing hurdle regression on data stored at multiple sites. By modeling zero and positive counts separately, we account for zero-inflation in the outcome, which is common when characterizing patient hospitalization frequency. Our simulations show that our algorithm achieves accuracy comparable to that of the oracle estimator, which uses all patient-level data pooled together. We apply our algorithm to data from the Children's Hospital of Philadelphia to estimate how often a patient is likely to be hospitalized given data collected during clinical visits.
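
As a rough illustration of what "modeling zero and positive counts separately" can mean, the sketch below writes out the log-likelihood of a hurdle Poisson model: a logistic component for whether the count is positive, and a zero-truncated Poisson component for the size of positive counts. It is not the authors' algorithm. The choice of a Poisson count component, the function name hurdle_loglik, the site names, and all data and coefficient values are assumptions made for illustration. The sketch also checks one basic fact that distributed approaches can build on: the pooled log-likelihood equals the sum of per-site log-likelihoods, so each site can report an aggregate instead of patient-level rows.

```python
import math


def hurdle_loglik(y, x, gamma, beta):
    """Log-likelihood of a hurdle Poisson model (illustrative sketch).

    y:     list of non-negative integer counts
    x:     list of covariate rows (each row includes an intercept term)
    gamma: coefficients of the logistic part, modeling P(y > 0)
    beta:  coefficients of the zero-truncated Poisson part, used when y > 0
    """
    total = 0.0
    for yi, xi in zip(y, x):
        eta = sum(g * v for g, v in zip(gamma, xi))
        if yi == 0:
            # log P(y = 0) = -log(1 + exp(eta))
            total += -math.log1p(math.exp(eta))
        else:
            # log P(y > 0) = -log(1 + exp(-eta))
            total += -math.log1p(math.exp(-eta))
            lam = math.exp(sum(b * v for b, v in zip(beta, xi)))
            # Zero-truncated Poisson: lam^y * exp(-lam) / (y! * (1 - exp(-lam)))
            total += (yi * math.log(lam) - lam - math.lgamma(yi + 1)
                      - math.log(-math.expm1(-lam)))
    return total


# Hypothetical data held at two sites: (counts, covariate rows).
sites = {
    "site_a": ([0, 0, 2, 1],
               [[1.0, 0.2], [1.0, -1.0], [1.0, 0.5], [1.0, 1.5]]),
    "site_b": ([0, 3, 0],
               [[1.0, 0.0], [1.0, 2.0], [1.0, -0.3]]),
}
gamma = [-0.5, 1.0]  # assumed logistic coefficients
beta = [0.1, 0.4]    # assumed count-model coefficients

# Each site evaluates its own contribution and shares only one number.
site_totals = [hurdle_loglik(y, x, gamma, beta) for y, x in sites.values()]

# Oracle-style evaluation on pooled patient-level data, for comparison only.
pooled_y = [yi for y, _ in sites.values() for yi in y]
pooled_x = [xi for _, x in sites.values() for xi in x]
pooled = hurdle_loglik(pooled_y, pooled_x, gamma, beta)

print(abs(sum(site_totals) - pooled) < 1e-9)
```

Exchanging full log-likelihood values or gradients at every optimization step would require many rounds of communication between sites. A communication-efficient method, as described in the abstract, aims to reduce that exchange; how it does so is not specified here.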


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program