Online Program

Simultaneous Variable Selection in Joint Models with Longitudinal and Survival Outcomes

*Zangdong He, Department of Biostatistics, Richard M. Fairbanks School of Public Health, Indiana University 
Wanzhu Tu, Indiana University School of Medicine 
Zhangsheng Yu, Department of Biostatistics, Indiana University School of Medicine  

Keywords: Mixed effect selection, Cholesky decomposition, Gaussian quadrature, ASO, EM algorithm, Penalized likelihood

Joint models with longitudinal and survival outcomes have been used with increasing frequency in clinical investigations. Inference on fixed and random effects is essential for assessing independent risk factors and subject-specific effects and for maintaining valid marginal covariance structures. Simultaneous variable selection in both the longitudinal and survival components is therefore the ultimate safeguard against erroneous inference due to model misspecification. Although joint models of longitudinal and survival outcomes are gaining popularity, variable selection in joint model settings has not been carefully studied, and no computational tools have been made available to practitioners. In this research, we propose a maximum doubly penalized likelihood (MDPL) method with adaptive least absolute shrinkage and selection operator (ALASSO) penalty functions to simultaneously select mixed effects in the longitudinal and survival joint model. To ensure positive-definiteness of the selected covariance matrix of random effects, a Cholesky decomposition is used to reparameterize the covariance matrix, and a group shrinkage penalty function is introduced. To correct the estimation bias due to the penalty, we also propose a two-stage procedure that drastically reduces the magnitude of the bias. For computation, the penalized likelihood is approximated by Gaussian quadrature and optimized with the expectation-maximization (EM) algorithm. Simulation studies showed small selection bias and moderate estimation bias in the first stage; the second-stage estimation substantially reduced the estimation bias. To illustrate, we analyzed a real data set with brain natriuretic peptide (BNP) as the longitudinal outcome and death as the survival outcome in patients with heart failure from an electronic medical record database.
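The role of the Cholesky reparameterization in random-effect selection can be illustrated with a minimal sketch. The abstract does not give the exact form of the group penalty, so the row-wise L2 norm below, along with the function names, weights, and numeric values, is an assumption made for illustration only. The sketch shows that shrinking an entire row of the Cholesky factor to zero removes the corresponding random effect (its variance and all its covariances vanish) while the implied covariance matrix of the remaining effects stays valid.

```python
import numpy as np


def covariance_from_cholesky(chol_factor):
    """Return D = L L^T for a lower-triangular factor L (hypothetical helper)."""
    L = np.tril(np.asarray(chol_factor, dtype=float))
    return L @ L.T


def group_penalty(chol_factor, weights, lam):
    """Assumed adaptive group penalty: lam * sum_k w_k * ||row k of L||_2."""
    L = np.tril(np.asarray(chol_factor, dtype=float))
    row_norms = np.linalg.norm(L, axis=1)
    return lam * float(np.sum(np.asarray(weights, dtype=float) * row_norms))


# Three candidate random effects; the second row has been shrunk to zero,
# which corresponds to dropping the second random effect.
L = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.5, 0.3, 0.8],
])

D = covariance_from_cholesky(L)
penalty = group_penalty(L, weights=[1.0, 1.0, 1.0], lam=2.0)

# The second random effect has zero variance and zero covariances.
print(D)
# D is positive semi-definite by construction (all eigenvalues >= 0).
print(np.linalg.eigvalsh(D))
print(penalty)
```

Penalizing whole rows rather than individual entries is what makes the selection act on a random effect as a unit: a single nonzero entry in a row would keep that effect in the model.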