Abstract:
|
Strong correlations among features are a well-known hurdle for existing variable selection and screening methods. Previous studies have demonstrated that transforming the predictors with a pre-processing step called ZCA whitening can greatly improve the accuracy of certain selection procedures. However, this whitening method achieves complete decorrelation at the cost of similarity to the original set of predictors and, thus, interpretability. We propose a more general technique, based on semidefinite programming, that leaves a small, harmless level of collinearity in exchange for a stronger mapping between the original and transformed variables. We demonstrate the benefits and drawbacks of this method, along with other decorrelation procedures, when applied prior to selection techniques, through an in-depth simulation study and a real data application.
|
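
For readers unfamiliar with the ZCA whitening step mentioned in the abstract, the following is a minimal sketch of the standard transformation, in which the centered predictors are multiplied by the inverse symmetric square root of their sample covariance. It illustrates only the full-decorrelation baseline, not the proposed semidefinite-programming relaxation. The function name `zca_whiten`, the `eps` stabilizer, and the simulated equicorrelated design are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def zca_whiten(X, eps=1e-10):
    """Hypothetical helper: ZCA-whiten the columns of X.

    Returns the whitened data Z and the symmetric whitening matrix W,
    where W is the inverse symmetric square root of the sample covariance.
    """
    Xc = X - X.mean(axis=0)                 # center each predictor
    cov = np.cov(Xc, rowvar=False)          # p x p sample covariance
    evals, evecs = np.linalg.eigh(cov)      # eigendecomposition of a symmetric matrix
    # eps is an illustrative guard against division by near-zero eigenvalues
    W = evecs @ np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T
    return Xc @ W, W

# Simulated example (assumed setup): four predictors with pairwise correlation 0.9
rng = np.random.default_rng(0)
n, p, rho = 500, 4, 0.9
sigma = np.full((p, p), rho) + (1 - rho) * np.eye(p)
X = rng.multivariate_normal(np.zeros(p), sigma, size=n)

Z, W = zca_whiten(X)
cov_Z = np.cov(Z, rowvar=False)             # approximately the identity matrix
```

Because `W` is symmetric, each whitened variable is a rotation-free rescaling of the originals, which is why ZCA is usually described as the whitening that stays closest to the original predictors; even so, complete decorrelation can move the transformed variables far from the originals when correlations are strong.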