Abstract Details

Activity Number: 471 - Advances in High-Dimensional Inference and Multiple Testing
Type: Contributed
Date/Time: Wednesday, July 31, 2019 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #304328
Title: Cross Validation Importance Learning
Author(s): Chenglong Ye* and Yuhong Yang
Companies: University of Minnesota and University of Minnesota
Keywords: Variable Importance; Cross Validation

It is well recognized that many machine learning methods perform very well in a wide range of applications, from virtual personal assistants to online customer support, and are benefiting people's lives. However, most machine learning methods do not provide a variable importance measure, which is often a barrier to interpreting their results. In this talk, we present two types of variable importance measures, collectively termed cross validation importance learning (CVIL). Given any specific method, by deleting a variable from the data set or replacing the variable with a constant, CVIL measures the relative change in the predictive performance of the model from a cross-validation perspective. Under mild conditions, CVIL is consistent in the sense that it converges to the theoretical variable importance as the sample size grows. Confidence intervals are constructed to quantify the reliability of the proposed CVIL importance measure. Through simulations and real data examples, we show that CVIL provides a ranking of variable importance for any seemingly uninterpretable predictive algorithm, such as random forest.
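The replace-with-a-constant variant described above can be sketched roughly as follows. This is an illustrative toy implementation, not the authors' code: the synthetic data, the choice of scikit-learn's RandomForestRegressor, and the specific importance formula (relative increase in K-fold cross-validated prediction error) are our own assumptions for the sake of the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Synthetic data: y depends strongly on X[:, 0], weakly on X[:, 1], not on X[:, 2].
n = 300
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

def cv_error(X, y, n_splits=5):
    """Mean squared prediction error estimated by K-fold cross-validation."""
    errs = []
    for train, test in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(X):
        model = RandomForestRegressor(n_estimators=100, random_state=0)
        model.fit(X[train], y[train])
        errs.append(np.mean((model.predict(X[test]) - y[test]) ** 2))
    return float(np.mean(errs))

base = cv_error(X, y)
importance = {}
for j in range(X.shape[1]):
    X_mod = X.copy()
    X_mod[:, j] = X[:, j].mean()        # replace variable j with a constant
    importance[j] = (cv_error(X_mod, y) - base) / base  # relative CV error increase

print(importance)  # variable 0 should rank first by a wide margin
```

The resulting numbers order the variables by how much cross-validated predictive performance degrades when each one is neutralized, which is the ranking interpretation of importance that the abstract describes.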

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program