Abstract:
|
K-fold cross-validation is widely adopted as a model selection criterion. In K-fold cross-validation, K-1 folds are used for model construction and the hold-out fold is allocated to model validation, so model construction is emphasized over model validation. However, some studies show that placing more emphasis on the validation procedure can lead to improved model selection. Specifically, leave-m-out cross-validation with n samples can achieve variable-selection consistency when m/n approaches 1. In this work, a new cross-validation method is proposed within the framework of K-fold cross-validation. The proposed method uses K-1 folds of the data for model validation, while the remaining fold is used for model construction. This yields K-1 predicted values for each observation, which are averaged to produce a final predicted value. Model selection based on these averaged predicted values then reduces variation in the assessment. The variable-selection consistency of the proposed method is established, and its advantage over K-fold cross-validation with finite samples is examined under linear, nonlinear, and high-dimensional models.
|