Abstract:
|
We compare the prediction performance of commonly used methods for 30-day all-cause non-elective readmission risk. LACE, Stepwise logistic regression, LASSO logistic regression, and AdaBoost are compared with fitting-data sample sizes varying from 2,500 to 80,000. Our results confirm that LACE has moderate discrimination power, with an AUC of around 0.65-0.66, which can be improved to 0.73-0.74 when additional variables from the EMR are considered. When the sample size is small (≤5,000), LASSO performs best; when the sample size is large (≥20,000), the Stepwise method has a slightly lower AUC (0.734) than LASSO (0.737) and AdaBoost (0.737). We also show that a large proportion of independent variables might be falsely selected as predictors when a single method and a single division of the data into fitting and validating subsets are used. However, it is possible to identify "true" important predictors by repeatedly dividing the data into fitting and validating subsets and deriving the final model from a summary of the results. Our comparison strategy has utility beyond readmission risk prediction and is applicable to other types of predictive models in clinical studies.
|
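The repeated-division strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the 0.2 correlation threshold, the 50 splits, the 50% fitting fraction, the 0.8 stability cutoff, and the synthetic data are all assumptions made for illustration. A simple correlation filter stands in for the selection step; in practice it would be replaced by a method such as LASSO or Stepwise logistic regression.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def correlation_selector(rows, labels, threshold=0.2):
    """Hypothetical stand-in selector (not the paper's method): keep columns
    whose absolute Pearson correlation with the outcome exceeds `threshold`."""
    selected = set()
    y_mean, y_sd = mean(labels), pstdev(labels)
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        x_mean, x_sd = mean(col), pstdev(col)
        if x_sd == 0 or y_sd == 0:
            continue  # a constant column carries no information
        cov = mean((x - x_mean) * (y - y_mean) for x, y in zip(col, labels))
        if abs(cov / (x_sd * y_sd)) > threshold:
            selected.add(j)
    return selected


def selection_frequency(rows, labels, selector, n_splits=50,
                        fit_fraction=0.5, seed=0):
    """Repeatedly split the data at random, run `selector` on the fitting
    part, and return the fraction of splits in which each column was chosen."""
    rng = random.Random(seed)
    n = len(rows)
    counts = Counter()
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        fit_idx = idx[: int(n * fit_fraction)]
        fit_rows = [rows[i] for i in fit_idx]
        fit_labels = [labels[i] for i in fit_idx]
        counts.update(selector(fit_rows, fit_labels))
    return {j: counts[j] / n_splits for j in range(len(rows[0]))}


# Synthetic data (assumed): column 0 drives the outcome, columns 1-4 are noise.
rng = random.Random(1)
rows = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(400)]
labels = [1 if r[0] + rng.gauss(0, 1) > 0 else 0 for r in rows]

freq = selection_frequency(rows, labels, correlation_selector)
# Treat a variable as a stable predictor if it is selected in at least 80%
# of the splits (an illustrative cutoff, not one taken from the paper).
stable = sorted(j for j, f in freq.items() if f >= 0.8)
```

Summarizing selection frequency across many splits, rather than trusting one split, is what separates variables chosen consistently from those chosen by chance in a single division of the data.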