Abstract:
|
When the number of candidate variables is large, variable selection is often necessary to obtain a parsimonious prediction model. We address two complications that arise when time-dependent Cox models are used to predict time to event. First, the candidate variables may change over time. Second, there may be an inherent grouping structure (GS) or strong/weak heredity among these variables. For example, (1) the binary indicators representing a single categorical variable should be included or excluded together, and (2) the selection of an interaction depends on the selection of its main terms. Our proposed technique applies sparsity-inducing penalties to groups of variables and permits a user-specified GS that incorporates prior knowledge about the relationships among candidate variables and their interactions. Such dependence can also be nested, which calls for multiple penalty terms. Optimization is performed using a hybrid of accelerated proximal gradient descent and block-wise coordinate descent to improve efficiency. We further apply our method to a large electronic health records dataset from Quebec, Canada.
|
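As a rough illustration of how a sparsity-inducing group penalty includes or excludes variables together, the following minimal sketch implements the group soft-thresholding step that a proximal gradient method would apply in the simplest case of non-overlapping groups. It is not the authors' algorithm: the function name `group_soft_threshold`, the example coefficients, and the threshold value are all hypothetical, and the nested or overlapping grouping structures described in the abstract would need additional penalty terms not shown here.

```python
import math


def group_soft_threshold(beta, groups, threshold):
    """Proximal operator of threshold * sum_g ||beta_g||_2.

    Assumes the groups are non-overlapping lists of coefficient indices
    (a simplifying assumption made for this sketch).
    """
    out = list(beta)
    for idx in groups:
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        # Shrink the whole group toward zero; if its norm is at most the
        # threshold, every coefficient in the group is set to zero.
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        for j in idx:
            out[j] = scale * beta[j]
    return out


# Hypothetical coefficients: indices 0-1 form one group (e.g. dummy
# indicators of one categorical variable), indices 2-3 form another.
beta = [3.0, 4.0, 0.1, -0.2]
groups = [[0, 1], [2, 3]]
shrunk = group_soft_threshold(beta, groups, threshold=1.0)
```

In this toy example the first group has norm 5 and is shrunk by a common factor, while the second group has a norm below the threshold and is removed as a whole, which is the collective include-or-exclude behaviour described for categorical indicators.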