
All Times EDT

Abstract Details

Activity Number: 165 - SLDS CSpeed 2
Type: Contributed
Date/Time: Tuesday, August 10, 2021 : 10:00 AM to 11:50 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #319062
Title: Risk Estimation in the Normal Means Problem via Auxiliary Randomization
Author(s): Natalia Lombardi de Oliveira* and Ryan Tibshirani and Jing Lei
Companies: Carnegie Mellon University and Carnegie Mellon University and Carnegie Mellon University
Keywords: risk estimation; multivariate normal mean; SURE; prediction error
Abstract:

We propose a new approach for estimating the risk in the classical normal means problem. We construct an exactly unbiased estimator of risk for any estimator without access to test data, and without relying on any kind of resampling or sample-splitting. A crucial aspect of our approach is that it is unbiased for a slightly harder version of the problem, in which the noise level in the data is assumed to be elevated. The key idea is to generate two auxiliary data sets from the training data by adding carefully constructed synthetic noise; the estimator is then trained on the first of these data sets and tested on the second. Under some conditions, this approach exactly recovers (as a special case, in the limit as the amount of added noise vanishes) classical methodology in the statistics literature such as Mallows' Cp estimator, and more generally, Stein's unbiased risk estimator. Through a bias-variance decomposition of our risk estimator, we quantify the order of the bias and variance as a function of the magnitude of added noise. Finally, we show that simply averaging the estimated risk over multiple replications of adding noise is an effective way of controlling the variance.
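The abstract does not spell out the construction, but one standard way to realize the idea it describes is the following sketch: given data y ~ N(theta, sigma^2 I), draw synthetic noise omega ~ N(0, sigma^2 I) and form y + sqrt(alpha)*omega (the "train" copy) and y - omega/sqrt(alpha) (the "test" copy), which are independent by joint Gaussianity. Evaluating the trained estimator against the test copy and subtracting a known constant then gives an unbiased estimate of its risk at the elevated noise level (1 + alpha)*sigma^2. The specific formulas below (the alpha parameterization, the additive correction n*sigma^2*(1 + 1/alpha)) are an illustrative assumption, not taken from the abstract itself.

```python
import numpy as np

def coupled_risk_estimate(y, g, sigma, alpha, rng):
    """One draw of the auxiliary-randomization risk estimate (illustrative).

    y     : observed data vector, assumed ~ N(theta, sigma^2 I)
    g     : estimator of the mean, a function of a data vector
    alpha : magnitude of added noise; the risk being estimated is that of
            g at the elevated noise level (1 + alpha) * sigma^2
    """
    n = y.size
    omega = rng.normal(scale=sigma, size=n)       # carefully constructed synthetic noise
    y_train = y + np.sqrt(alpha) * omega          # first auxiliary data set: train on this
    y_test = y - omega / np.sqrt(alpha)           # second auxiliary set: independent of y_train
    # ||g(y_train) - y_test||^2 has expectation risk + n*sigma^2*(1 + 1/alpha),
    # so subtracting that constant gives an exactly unbiased risk estimate.
    return np.sum((g(y_train) - y_test) ** 2) - n * sigma**2 * (1.0 + 1.0 / alpha)

def averaged_risk_estimate(y, g, sigma, alpha, reps, rng):
    """Average over replications of the added noise to control the variance."""
    return np.mean([coupled_risk_estimate(y, g, sigma, alpha, rng)
                    for _ in range(reps)])

# Sanity check with the identity estimator g(y) = y, whose risk at the
# elevated noise level (1 + alpha) * sigma^2 is exactly n * (1 + alpha) * sigma^2.
rng = np.random.default_rng(0)
n, sigma, alpha = 500, 1.0, 0.1
y = rng.normal(size=n) * sigma                   # theta = 0 for the demo
est = averaged_risk_estimate(y, lambda v: v, sigma, alpha, reps=2000, rng=rng)
true_risk = n * (1.0 + alpha) * sigma**2
```

As alpha shrinks, a single draw of this estimate becomes increasingly variable (the 1/alpha correction blows up), which is why the averaging step above matters; the abstract's bias-variance decomposition quantifies this trade-off.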


Authors who are presenting talks have a * after their name.
