Abstract:
|
In high-dimensional estimation, analysts are faced with more parameters p than available observations n, and asymptotic analysis of performance allows the ratio p/n → ∞. This situation makes regularization both necessary and desirable for estimators to possess theoretical guarantees. However, the amount of regularization is integral to achieving good performance. In practice, choosing the tuning parameter is done through resampling methods, information criteria, or by reformulating the optimization problem, each with some theoretical guarantee for the low- or high-dimensional regimes. However, there are some notable deficiencies in the literature. The theory, and sometimes practice, of many methods relies on knowledge of, or an estimate for, the noise variance. In this paper, we (1) provide theoretical intuition suggesting that some previous approaches based on information criteria work poorly in high dimensions, (2) introduce new risk estimators, and (3) compare our proposal to many existing methods for choosing the tuning parameter for the lasso. We find that our new estimators are often better than the existing approaches across a wide range of conditions and evaluation criteria.
|
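The abstract mentions resampling as one common way to choose the lasso tuning parameter. As a minimal, self-contained sketch of that practice (not the paper's proposed risk estimators), the following implements the lasso by cyclic coordinate descent and selects the penalty λ by K-fold cross-validation; the fold count, λ grid, and simulated data are illustrative assumptions, not from the paper.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=200):
    """Lasso via cyclic coordinate descent.

    Minimizes (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-coordinate curvature
    r = y.copy()                         # residual y - X b (b starts at 0)
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]          # partial residual excluding coordinate j
            rho = X[:, j] @ r / n
            # soft-thresholding update
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]
    return b


def cv_lasso(X, y, lams, k=5, seed=0):
    """Pick lam from a grid by minimizing K-fold cross-validated squared error."""
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    folds = rng.permutation(n) % k       # random fold assignment
    errs = np.zeros(len(lams))
    for i, lam in enumerate(lams):
        for f in range(k):
            tr, te = folds != f, folds == f
            b = lasso_cd(X[tr], y[tr], lam)
            errs[i] += ((y[te] - X[te] @ b) ** 2).sum()
    errs /= n
    return lams[int(np.argmin(errs))], errs


# Illustrative high-dimensional setup (p > n), sparse truth:
rng = np.random.default_rng(1)
n, p = 100, 200
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0
y = X @ beta + rng.standard_normal(n)

lams = np.array([0.01, 0.05, 0.1, 0.2, 0.5])
best, errs = cv_lasso(X, y, lams)
```

Note that this selection rule needs no estimate of the noise variance, which is part of why resampling is popular in practice; the paper's point is that its theoretical guarantees and computational cost differ from those of information criteria.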