Abstract:
|
In this talk we discuss the use of generalized cross-validation (GCV) as a form of information criterion for the selection of sparse parameters in a high-dimensional context. The main contributions are two results that support the application of GCV. The first result links GCV to Mallows's Cp criterion and, through Mallows's Cp, to the prediction error as an unobservable objective function. From this link it follows that GCV inherits the issues associated with applying Mallows's Cp to sparse variable selection in high-dimensional models. These issues stem from the prominent effect of false positives in a sparse variable selection, especially when the false positives are not tempered by some form of shrinkage, as provided, for instance, by the lasso. The second result concerns the behavior of GCV for models close to the full (high-dimensional) model. This result ensures that the full model cannot become the absolute optimum of the GCV curve, which would otherwise disturb the optimization in practice.
|
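As an illustrative sketch (not part of the abstract itself), the GCV criterion discussed above can be computed for the lasso as GCV(λ) = (RSS/n) / (1 − df/n)², taking the degrees of freedom df to be the size of the selected support. The data, the simple coordinate-descent solver, and all names below are hypothetical; note how GCV diverges as df approaches n, so a nearly full model cannot be the minimizer:

```python
import numpy as np

def soft_threshold(z, g):
    """Soft-thresholding operator used by the lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - g, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Plain cyclic coordinate descent for the lasso (illustrative, not optimized)."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-column curvature terms
    r = y - X @ b                       # current residual
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]         # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]         # put the updated coordinate back
    return b

def gcv(X, y, b):
    """GCV(lam) = (RSS/n) / (1 - df/n)^2, with df = size of the selected support."""
    n = len(y)
    df = np.count_nonzero(b)
    if df >= n:                         # GCV blows up as the model approaches saturation
        return np.inf
    rss = np.sum((y - X @ b) ** 2)
    return (rss / n) / (1.0 - df / n) ** 2

# Hypothetical sparse high-dimensional setup: p > n, only 5 active coefficients.
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 3.0
y = X @ beta + rng.standard_normal(n)

# Scan a grid of penalty levels and keep the GCV minimizer.
lams = np.geomspace(0.01, 3.0, 10)
scores = [gcv(X, y, lasso_cd(X, y, lam)) for lam in lams]
best_lam = lams[int(np.argmin(scores))]
best_support = np.count_nonzero(lasso_cd(X, y, best_lam))
```

The shrinkage applied by the lasso tempers the false positives that enter the support, which is exactly the condition under which the abstract argues GCV (via its link to Mallows's Cp) remains usable.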