Abstract:
|
Variable selection is fundamental in high-dimensional problems. Various variable selection methods have been developed recently, for example, forward stepwise regression and least angle regression, among many others. These methods are sequential in nature: variables are added to the model one at a time. For such procedures, it is crucial to find a stopping criterion. One of the most commonly used techniques in practice is cross-validation (CV). However, CV has a high computational cost and lacks statistical interpretation. To overcome these drawbacks, we introduce a flexible and efficient testing-based variable selection approach that can be combined with any sequential selection procedure. At each step of the selection, we test the overall signal in the remaining inactive variables using the maximal absolute partial correlation between the inactive variables and the response, conditional on the active variables. Furthermore, we develop a stopping criterion based on the stepwise $p$-value. Numerical studies show that the proposed method delivers very competitive performance compared to CV in terms of both variable selection accuracy and computational complexity.
|
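As an illustration of the test statistic described in the abstract, the following is a minimal sketch, assuming an ordinary linear-regression setting with an intercept. It residualizes the response and each inactive variable on the active variables and returns the largest absolute correlation between the residuals. The function name `max_abs_partial_corr` and its arguments are hypothetical and not taken from the paper. The abstract does not state the null distribution used to turn this statistic into a stepwise $p$-value, so the sketch stops at the statistic itself.

```python
import numpy as np

def max_abs_partial_corr(X, y, active):
    """Maximal absolute partial correlation between the inactive columns
    of X and y, conditional on the active columns (hypothetical helper).

    Returns (statistic, index of the inactive column attaining it).
    """
    n, p = X.shape
    active_set = set(active)
    inactive = [j for j in range(p) if j not in active_set]

    # Conditioning design: intercept plus the currently active columns.
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in active])

    def residualize(v):
        # Least-squares residual of v (vector or matrix) after regressing on Z.
        coef, *_ = np.linalg.lstsq(Z, v, rcond=None)
        return v - Z @ coef

    ry = residualize(y)
    RX = residualize(X[:, inactive])

    # Correlation between each residualized column and the residualized response.
    corr = (RX.T @ ry) / (np.linalg.norm(RX, axis=0) * np.linalg.norm(ry))
    k = int(np.argmax(np.abs(corr)))
    return float(abs(corr[k])), inactive[k]
```

In a sequential procedure, one would presumably call such a function at every step with the current active set, convert the statistic to a $p$-value under the paper's null distribution, and stop once the $p$-value is no longer significant.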