All Times EDT

Abstract Details

Activity Number: 416 - SLDS CSpeed 7
Type: Contributed
Date/Time: Thursday, August 12, 2021 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #318453
Title: Doubly Robust Feature Selection with Mean and Variance Outlier Detection and Oracle Properties
Author(s): Luca Insolia* and Runze Li and Francesca Chiaromonte and Marco Riani
Companies: Sant’Anna School of Advanced Studies and Pennsylvania State University and Pennsylvania State University and University of Parma
Keywords: Mean-shift outliers; Nonconvex penalties; Robust estimation; Variable selection; Variance-inflation outliers
Abstract:

High-dimensional linear regression models are now pervasive across research domains. We propose a general approach to handle data contamination that might hinder classical estimators. Specifically, we consider the co-occurrence of mean-shift and variance-inflation outliers, which are modeled as additional fixed and random components, respectively, and evaluated independently. Our proposal performs variable selection while detecting and down-weighting variance-inflation outliers, excluding mean-shift outliers, and retaining non-outlying cases with full weight. Feature selection and mean-shift outlier detection are performed through a robust class of nonconcave penalization methods. Variance-inflation outlier detection is based on the penalization of the restricted posterior mode. The resulting approach satisfies a robust oracle property for feature selection in the presence of data contamination, even when the number of features increases exponentially with the sample size, and detects truly outlying cases with asymptotic probability one. This provides an optimal trade-off between a high breakdown point and efficiency. Effective and computationally lean heuristic methods are also presented.
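
To make the contamination setting concrete, the sketch below simulates a sparse linear regression in which some cases receive a fixed offset in the response (mean-shift outliers) and others have an inflated error variance (variance-inflation outliers). It is only an illustration of the data-generating scenario described in the abstract, not the authors' estimator or code. The function name `simulate_contaminated_regression` and every parameter value (sample size, coefficients, shift size, inflation factor) are hypothetical choices made for this example.

```python
import random


def simulate_contaminated_regression(n=200, p=5, n_mean_shift=10, n_var_infl=10,
                                     shift=8.0, inflation=9.0, sigma=1.0, seed=0):
    """Simulate y = x'beta + offset + error with two kinds of outliers.

    All defaults are hypothetical values chosen for illustration only.
    Requires p >= 2 and n_mean_shift + n_var_infl <= n.
    """
    rng = random.Random(seed)

    # Sparse coefficient vector: only the first two features are active.
    beta = [2.0, -1.5] + [0.0] * (p - 2)

    # Pick disjoint sets of cases for the two contamination mechanisms.
    idx = list(range(n))
    rng.shuffle(idx)
    mean_shift_idx = set(idx[:n_mean_shift])
    var_infl_idx = set(idx[n_mean_shift:n_mean_shift + n_var_infl])

    X, y = [], []
    for i in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        signal = sum(b * v for b, v in zip(beta, x))
        # Mean-shift outlier: a fixed offset added to the response.
        offset = shift if i in mean_shift_idx else 0.0
        # Variance-inflation outlier: error variance multiplied by `inflation`.
        sd = sigma * inflation ** 0.5 if i in var_infl_idx else sigma
        X.append(x)
        y.append(signal + offset + rng.gauss(0.0, sd))

    return X, y, beta, mean_shift_idx, var_infl_idx


if __name__ == "__main__":
    X, y, beta, ms, vi = simulate_contaminated_regression()
    print(len(X), len(y), len(ms), len(vi))
```

In this scenario, a procedure of the kind the abstract describes would aim to recover the two active features, exclude the cases in `mean_shift_idx`, down-weight those in `var_infl_idx`, and keep all remaining cases at full weight.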


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program