Abstract:
|
Medical researchers who use linear mixed models (LMMs) to analyze electronic medical records (EMR) data face two challenges. The first is the ever-increasing size of EMR data sets. For example, the UCLA Health System hospitals and clinics have over 2.5 million annual patient visits. Statistical inference for LMMs on such large-scale EMR data is computationally very expensive. The second challenge is data sharing and patient privacy. To study rare diseases, medical researchers want to pool EMR data from multiple health systems to improve the power to detect signals. However, because of patient privacy concerns and other reasons, health systems rarely share data with people outside their organizations, which makes it impossible to pool EMR data. To address these challenges, we developed a Julia software package for statistical inference on massive and distributed linear mixed models using the Bag of Little Bootstraps (BLB) method. We demonstrate the statistical and computational performance of our method on real and synthetic data.
|