Abstract:
|
This paper develops a new approach to post-selection inference for screening high-dimensional predictors of survival outcomes. Post-selection inference for right-censored outcome data has been investigated, but much remains to be done to make such methods reliable and computationally scalable in high dimensions. Machine learning tools are commonly used to predict survival outcomes, but the estimated effect of a selected predictor suffers from confirmation bias unless the selection is taken into account. The approach constructs semiparametrically efficient estimators of the linear association between the predictors and the survival outcome, and builds a test statistic to detect the presence of an association between any of the predictors and the outcome. Further, a stabilization technique allows a normal calibration for the test statistic, which enables the construction of confidence intervals for the maximal association between a predictor and the outcome. Theoretical results show that the procedure is valid even when the number of predictors grows superpolynomially with sample size, and our simulations indicate that this asymptotic guarantee is reflective of the finite-sample performance of the test.
|