
Abstract Details

Activity Number: 120 - SPEED: Variable Selection and Networks
Type: Contributed
Date/Time: Monday, July 31, 2017 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #323602
Title: Impact of Divergence of Training and Testing Sets on Predictive Risk and Measure of Model Complexity
Author(s): Jieyi Jiang* and Yoonkyung Lee and Steven MacEachern
Companies: The Ohio State University and The Ohio State University and The Ohio State University
Keywords: model selection; parameter tuning; diffusion on covariates
Abstract:

One goal of model selection is to minimize predictive risk. Typical derivations of model selection criteria assume that the distributions of covariates in the training set and the future-prediction set are identical. In practice, however, we are often most interested in forecasts over a novel covariate distribution. This mismatch between the training distribution and the intended use of the model affects the predictive risk, estimates of that risk, and the model summaries derived from those estimates.
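
As a hypothetical illustration of this effect (not taken from the paper), the sketch below fits ordinary least squares to training data with standard Gaussian covariates and compares the Monte Carlo predictive risk on test covariates drawn from the same distribution against test covariates diffused by independent Gaussian noise. The dimensions, noise scales, and the Gaussian form of the diffusion are assumptions made for the demo.

import numpy as np

rng = np.random.default_rng(0)
n, p, sigma, tau = 50, 20, 1.0, 1.0   # sizes and scales are assumptions
beta = rng.normal(size=p)             # true coefficients

X = rng.normal(size=(n, p))                 # training covariates
y = X @ beta + sigma * rng.normal(size=n)   # training responses
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS fit

def risk(X_new):
    """Monte Carlo predictive risk (mean squared error) at new covariates."""
    y_new = X_new @ beta + sigma * rng.normal(size=len(X_new))
    return np.mean((y_new - X_new @ beta_hat) ** 2)

X_same = rng.normal(size=(100_000, p))                     # same distribution
X_diffused = X_same + tau * rng.normal(size=(100_000, p))  # diffused covariates

print("risk on the training covariate distribution:", risk(X_same))
print("risk on the diffused covariate distribution:", risk(X_diffused))

On typical runs the risk on the diffused distribution exceeds the in-distribution risk, since estimation error in beta_hat is amplified over the more dispersed covariates.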

We formulate the predictive problem, in which the future-prediction set diverges from the training set, in terms of a diffusion of the covariate distribution. As the covariates in the training set are diffused, measures of model complexity based on predictive risk change, affecting both model selection and the choice of tuning parameters. We provide results in several settings, including subset selection, ridge regression, and kernel regression, and propose ways to adjust the standard measures of model complexity. Simulation studies show the benefits of the adjustment.
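
The abstract does not spell out the proposed adjustment, so the sketch below only illustrates the claimed effect on tuning parameters in the kernel regression setting: a small Monte Carlo experiment with a Nadaraya-Watson estimator in which the risk-minimizing bandwidth is recomputed as the test covariates are diffused by isotropic Gaussian noise. The true function, kernel, sample size, and diffusion model are all assumptions, not the paper's setup.

import numpy as np

rng = np.random.default_rng(1)
f = np.sin                              # assumed true regression function
n, sigma = 200, 0.3                     # assumed sample size and noise level

x = rng.normal(size=n)                  # training covariates ~ N(0, 1)
y = f(x) + sigma * rng.normal(size=n)   # training responses

def nw_predict(x0, h):
    """Nadaraya-Watson estimate with a Gaussian kernel of bandwidth h."""
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
    # Guard against empty kernel windows far in the tails.
    return (w @ y) / np.maximum(w.sum(axis=1), 1e-300)

def test_risk(tau, h, m=20_000):
    """Monte Carlo risk when test covariates are diffused by N(0, tau^2)."""
    x0 = rng.normal(size=m) * np.sqrt(1.0 + tau**2)
    return np.mean((f(x0) - nw_predict(x0, h)) ** 2)

bandwidths = np.linspace(0.05, 1.5, 30)
for tau in (0.0, 1.0, 2.0):
    risks = [test_risk(tau, h) for h in bandwidths]
    print(f"tau = {tau}: risk-minimizing bandwidth ~ "
          f"{bandwidths[int(np.argmin(risks))]:.2f}")

On typical runs the risk-minimizing bandwidth grows with the diffusion scale tau, since diffused test points land in regions where training data are sparse and more smoothing pays off; this mirrors the claim that the choice of tuning parameters shifts under diffusion.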


Authors who are presenting talks have a * after their name.
