Fitting parametric random effects models in very large data sets
Leonard Egede, Charleston VA REAP and MUSC
*Mulugeta Gebregziabher, Charleston VA REAP and MUSC
Gregory E Gilbert, Charleston VA REAP
Kelly Hunt, Charleston VA REAP and MUSC
Patrick Mauldin, Charleston VA REAP and MUSC
Paul J Nietert, MUSC
Keywords: generalized linear mixed model, random effects meta-regression, longitudinal data, very large dataset
With the current focus on personalized medicine, patient-level inference is often of key interest in translational research. As a result, random effects models (REM) are becoming popular for patient-level inference. However, for very large data sets, it can be difficult to fit REM using commonly available statistical software such as SAS (Pennell and Dunson, 2007). For example, in a study of over 800,000 Veterans followed over 5 years, fitting a generalized linear mixed model (GLMM) using currently available procedures in SAS (e.g., PROC GLIMMIX) was very difficult, and similar problems exist in Stata and R. Thus, this study proposes and assesses the performance of a meta-regression approach and of methods based on sampling from the full data. We compared three approaches: simple random sampling with weighted pseudo-likelihood and a simulated 95% CI, VISN (Veterans Integrated Service Network)-based stratified sampling with weighted pseudo-likelihood and a simulated 95% CI, and a random effects meta-regression (REMR) of VISN-level estimates. Our results indicate that REMR provides unbiased and efficient parameter estimates of the GLMM when the VISN-level estimates are homogeneous. The sampling approaches also provide unbiased parameter estimates for linear mixed models, but GLMM results were biased for the smaller samples.
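The REMR idea is to fit the model separately within each VISN and then pool the stratum-level coefficient estimates with a random effects meta-analysis. A minimal sketch of the standard DerSimonian-Laird pooling step is shown below in Python; the function name and the numeric examples are illustrative, not taken from the study:

```python
import math

def dersimonian_laird(estimates, std_errors):
    """Pool stratum-level (e.g., VISN-level) coefficient estimates via
    DerSimonian-Laird random-effects meta-analysis.

    Returns the pooled estimate, its standard error, and tau^2, the
    estimated between-stratum variance."""
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]        # fixed-effect (inverse-variance) weights
    sw = sum(w)
    beta_fe = sum(wi * b for wi, b in zip(w, estimates)) / sw
    # Cochran's Q statistic measures between-stratum heterogeneity.
    q = sum(wi * (b - beta_fe) ** 2 for wi, b in zip(w, estimates))
    # Method-of-moments estimate of tau^2, truncated at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau^2 to each within-stratum variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    beta_re = sum(wi * b for wi, b in zip(w_re, estimates)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))
    return beta_re, se_re, tau2

# Hypothetical two-stratum example: log-odds-ratio estimates with standard errors.
beta, se, tau2 = dersimonian_laird([0.2, 0.8], [0.1, 0.1])
```

When the stratum-level estimates are homogeneous, Q is small, tau^2 is truncated to zero, and the pooled estimate reduces to the fixed-effect inverse-variance average, which is consistent with the abstract's finding that REMR performs well under homogeneity.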