
Abstract Details

Activity Number: 519 - Sparse Statistical Learning
Type: Contributed
Date/Time: Wednesday, August 2, 2017 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #324742
Title: Efficient Bounds for Best Subset Selection Using Mixed Integer Linear Programming
Author(s): Ana Kenney* and Francesca Chiaromonte and Giovanni Felici
Companies: Penn State University and IIASI CNR
Keywords: best subset selection ; mixed integer linear programming ; lasso ; least absolute deviation ; cross-validation
Abstract:

Regularization methods such as Lasso regression are widely used because they perform automated variable selection (by inducing sparsity) and parameter estimation simultaneously. This is especially desirable in high-dimensional settings where standard subset selection methods are computationally infeasible. Building upon recent literature, we use a mixed integer linear programming (MILP) formulation that controls sparsity directly by bounding the number of selected features. In particular, we focus on the selection of an appropriate bound, which was not explicitly explored in previous studies. Choosing the bound by k-fold cross-validation can be computationally expensive even with recent improvements in MILP solvers. We exploit the flexible nature of MILP to implement an inner cross-validation scheme and a simple stopping rule that allow us to choose the bound efficiently. Results across various simulation scenarios show that our approach is computationally viable and can compete with, and even outperform, Lasso regression.
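For readers unfamiliar with how best subset selection is posed as a MILP, the following is an illustrative sketch, not the authors' actual formulation: a least absolute deviation (LAD) fit with binary selection indicators linked to coefficients through a big-M constraint and a sparsity bound k, as in the abstract's keywords. The function name `best_subset_lad`, the big-M value, and the choice of `scipy.optimize.milp` (SciPy >= 1.9, HiGHS backend) as the solver are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

def best_subset_lad(X, y, k, M=100.0):
    """Sketch of best-subset LAD regression as a MILP (illustrative only).

    Minimize sum_i t_i subject to
        t_i >= |y_i - x_i' beta|      (linearized absolute residuals)
        -M z_j <= beta_j <= M z_j     (big-M link; z_j in {0, 1})
        sum_j z_j <= k                (explicit sparsity bound)
    Decision vector layout: [beta (p), t (n), z (p)].
    """
    n, p = X.shape
    # Objective: sum of the residual-magnitude variables t.
    c = np.concatenate([np.zeros(p), np.ones(n), np.zeros(p)])

    A1 = np.hstack([X, np.eye(n), np.zeros((n, p))])              #  X beta + t >= y
    A2 = np.hstack([-X, np.eye(n), np.zeros((n, p))])             # -X beta + t >= -y
    A3 = np.hstack([np.eye(p), np.zeros((p, n)), -M * np.eye(p)]) #  beta_j - M z_j <= 0
    A4 = np.hstack([-np.eye(p), np.zeros((p, n)), -M * np.eye(p)])# -beta_j - M z_j <= 0
    A5 = np.concatenate([np.zeros(p + n), np.ones(p)])[None, :]   #  sum_j z_j <= k

    constraints = [
        LinearConstraint(A1, lb=y, ub=np.inf),
        LinearConstraint(A2, lb=-y, ub=np.inf),
        LinearConstraint(A3, lb=-np.inf, ub=0.0),
        LinearConstraint(A4, lb=-np.inf, ub=0.0),
        LinearConstraint(A5, lb=0.0, ub=float(k)),
    ]
    # Only the indicator variables z are integer (binary via the bounds below).
    integrality = np.concatenate([np.zeros(p + n), np.ones(p)])
    lb = np.concatenate([-np.inf * np.ones(p), np.zeros(n), np.zeros(p)])
    ub = np.concatenate([np.inf * np.ones(p), np.inf * np.ones(n), np.ones(p)])

    res = milp(c=c, constraints=constraints, integrality=integrality,
               bounds=Bounds(lb, ub))
    beta = res.x[:p]
    z = res.x[p + n:].round().astype(int)
    return beta, z
```

Selecting the bound k, which is the focus of the abstract, would then amount to re-solving this program for candidate values of k inside a cross-validation loop; the big-M constant must dominate the true coefficient magnitudes for the linking constraints to be valid.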


Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

Copyright © American Statistical Association