Keywords: double-divergence, data integration, high-dimensional, personalized modeling, separation-penalty, subgrouping
In this paper, we propose a heterogeneous modeling framework that achieves individual-wise feature selection and covariate-wise subgrouping simultaneously. In contrast to conventional model selection approaches, the key component of the new approach is a separation penalty with multi-directional shrinkages, which enables individualized modeling to distinguish strong signals from noisy ones and to select different relevant variables for different individuals. Meanwhile, the proposed model identifies subgroups within which individuals share similar covariate effects, and thus improves individualized estimation efficiency and feature selection accuracy. Moreover, the proposed model incorporates within-individual correlation for longitudinal data. We provide a general theoretical foundation under a double-divergence modeling framework in which the number of individuals and the number of individual-wise measurements can both diverge, which enables inference at both the individual level and the population level. In particular, we establish the population-wise oracle property for the individualized estimator to ensure its optimal large-sample property under various conditions. Simulation studies and an application to PTSD data illustrate the new approach and compare it with existing variable selection methods.