Saturday, October 21

Sat, Oct 21, 7:30 AM - 8:30 AM
Aventine Ballroom G

Continental Breakfast and Speed Poster 4 sponsored by Bank of America

Efficient Bounds for Best Subset Selection using Mixed Integer Linear Programming (304089)

*Ana Maria Kenney, Pennsylvania State University

Regularization methods such as Lasso regression are widely used due to their ability to perform automated variable selection (by inducing sparsity) and parameter estimation simultaneously. This is especially desirable in high-dimensional scenarios where standard subset selection methods are not computationally feasible. Building upon recent literature, we use a mixed integer linear program (MILP) formulation that directly controls sparsity by bounding the number of features. In particular, we focus on the selection of an appropriate bound – which was not explicitly explored in previous studies. Performing k-fold cross-validation can be computationally expensive even with recent improvements in MILP solvers. We exploit the flexible nature of MILP to implement an inner cross-validation scheme and a simple stopping rule that allow us to choose the bound efficiently. Results across various simulation scenarios show that our approach is computationally viable and can compete with, and also outperform, Lasso regression.

Online Program

Efficient Bounds for Best Subset Selection using Mixed Integer Linear Programming (304089)

American Statistical Association

Share