Abstract:
|
Many studies have sought to identify genomic features associated with a time-to-event outcome, and variable selection strategies are needed to search such high-dimensional spaces. Medical breakthroughs in recent years have led to cures for various diseases, including acute myeloid leukemia (AML). Mixture cure models can be used when a cured fraction exists in the sample studied. Unfortunately, few variable selection methods currently exist for mixture cure models when there are more predictors than samples. This study develops penalized Weibull mixture cure models for high-dimensional datasets, with estimation carried out using either the Generalized Monotone Incremental Forward Stagewise algorithm or the Expectation–Maximization algorithm. The resulting models allow identification of prognostic factors associated with the long-term (cure) outcome, the time-to-event (latency) outcome, or both. We compared the performance of the two algorithms with respect to false positive fraction, power, and prediction in extensive simulation studies. We then applied the models to AML gene expression data to further evaluate their performance.
|