Activity Number: 91 - High Dimensional Data, Causal Inference, Biostats Education, and More
Type: Contributed
Date/Time: Monday, August 9, 2021: 10:00 AM to 11:50 AM
Sponsor: ENAR

Abstract #318687

Title: Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Health Care Data
Author(s): Chongliang Luo* and Md. Nazmul Islam and Natalie E. Sheils and Rui Duan and Jenna Reps and Jiayi Tong and Mackenzie Edmonson and Martijn Schuemie and John Buresh and Yong Chen
Companies: University of Pennsylvania and UnitedHealth Group and UnitedHealth Group and Harvard University and Janssen Research and Development LLC and University of Pennsylvania and University of Pennsylvania and Janssen R&D and UnitedHealth Group and University of Pennsylvania
Keywords: communication-efficient algorithm; distributed research network; federated learning; non-iterative; COVID-19 hospitalization
Abstract:
Linear mixed models (LMMs) are commonly used in many areas, including epidemiology, for analyzing multi-site data with heterogeneous site-specific random effects. However, because of regulations protecting patients' privacy, sensitive individual patient data (IPD) usually cannot be shared across sites. In this paper we propose a novel algorithm for distributed linear mixed models (DLMMs). The proposed DLMM algorithm achieves exactly the same results as an analysis of IPD pooled from all sites, which is why we call it lossless. The algorithm requires each site to contribute some aggregated data (AD) in only one iteration. We apply the proposed DLMM algorithm to analyze the association of length of stay of COVID-19 hospitalization with demographic and clinical characteristics, using administrative claims data from the UnitedHealth Group Clinical Research Database.
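The lossless idea can be illustrated with a minimal sketch. It is not the authors' implementation and it makes simplifying assumptions that the abstract does not state: a random-intercept-only model, a design matrix whose first column is the intercept, and variance components `sigma2` and `tau2` treated as known rather than estimated. The function names (`site_summaries`, `gls_from_summaries`, `gls_from_pooled`) and the simulated data are hypothetical. Under these assumptions, each site shares only its sample size and the cross-products X'X and X'y, and the generalized least squares estimate of the fixed effects built from those summaries should agree with the estimate computed from pooled patient-level rows.

```python
import numpy as np


def site_summaries(X, y):
    """Aggregated data a site would share; no patient-level rows leave the site."""
    return {"n": X.shape[0], "XtX": X.T @ X, "Xty": X.T @ y}


def gls_from_summaries(summaries, sigma2, tau2):
    """GLS fixed effects from per-site summaries (random-intercept model).

    Uses the closed form V^{-1} = (I - c * 1 1') / sigma2 with
    c = tau2 / (sigma2 + n * tau2), so only X'X, X'y and n are needed.
    Assumes column 0 of X is the intercept, so X'1 = XtX[:, 0] and 1'y = Xty[0].
    """
    p = summaries[0]["XtX"].shape[0]
    A = np.zeros((p, p))
    b = np.zeros(p)
    for s in summaries:
        c = tau2 / (sigma2 + s["n"] * tau2)
        col_sums = s["XtX"][:, 0]  # X'1
        y_sum = s["Xty"][0]        # 1'y
        A += (s["XtX"] - c * np.outer(col_sums, col_sums)) / sigma2
        b += (s["Xty"] - c * col_sums * y_sum) / sigma2
    return np.linalg.solve(A, b)


def gls_from_pooled(sites, sigma2, tau2):
    """Reference GLS estimate computed directly from patient-level data."""
    p = sites[0][0].shape[1]
    A = np.zeros((p, p))
    b = np.zeros(p)
    for X, y in sites:
        n = X.shape[0]
        V = sigma2 * np.eye(n) + tau2 * np.ones((n, n))
        V_inv = np.linalg.inv(V)
        A += X.T @ V_inv @ X
        b += X.T @ V_inv @ y
    return np.linalg.solve(A, b)


# Simulated data for three hypothetical sites (illustrative values only).
rng = np.random.default_rng(0)
beta_true = np.array([1.0, 0.5, -2.0])
sigma2, tau2 = 1.0, 0.8
sites = []
for n in (30, 50, 80):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    u = rng.normal(scale=np.sqrt(tau2))  # site-specific random intercept
    y = X @ beta_true + u + rng.normal(scale=np.sqrt(sigma2), size=n)
    sites.append((X, y))

summaries = [site_summaries(X, y) for X, y in sites]
beta_ad = gls_from_summaries(summaries, sigma2, tau2)
beta_pooled = gls_from_pooled(sites, sigma2, tau2)
print("max abs difference:", np.max(np.abs(beta_ad - beta_pooled)))
```

In a full analysis the variance components would also have to be estimated, which requires evaluating the likelihood from the same kind of summaries (plus y'y); this sketch covers only the fixed-effects step for given variance components.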
Authors who are presenting talks have a * after their name.