Abstract:
|
Competing risks data arise in clinical settings when the primary event of interest competes with one or more other events. The goal is to model the time to the primary event of interest, for example, death due to a specific cause, using available predictors. In gene expression studies, the number of genes often far exceeds the number of subjects, so selecting a parsimonious set of features that predicts the outcome is challenging. Here, we propose a variable selection method, based on the proportional subdistribution hazards model, that maximizes the log-partial likelihood function penalized by a non-convex penalty function. Using simulation studies, we show that this method works well in high-dimensional settings and generally selects the important predictors while including few unimportant ones. We demonstrate the method by modeling time to relapse, using demographic, clinical, and genomic features, for acute myeloid leukemia patients who have achieved complete remission.
|