Abstract:
|
Federated learning is an appealing framework for analyzing sensitive data from distributed health data networks because it protects data privacy. Under this framework, data partners at individual sites collaboratively build an analytical model under the orchestration of a coordinating site, while keeping the data decentralized. However, existing federated learning methods mostly assume that data across sites are homogeneous samples of the global population, and hence fail to properly account for the extra variability across sites in estimation and inference. We will introduce a top-down partitioning approach and a bottom-up fusion approach for estimating parameters of interest, both within the federated learning framework. Both methods treat the data as coming from heterogeneous sources and appropriately borrow information across the data network for estimation and inference. An application to a multi-hospital electronic health records network will also be discussed.
|