Abstract:
|
A new approach to feature selection, made possible by improved Mixed Integer Optimization algorithms, is gaining increasing attention in the statistical community. Despite the appeal of this approach, its computational viability on problems of realistic size (number of features) and complexity (e.g., patterns in the signals, linear associations among features, in particular between active and non-active ones) is still under discussion. We believe that, as proposed, the new approach fails to exploit several facets that could further reduce the computational burden and improve performance. This is critical, especially because this approach, like most feature selection techniques, requires data-driven selection of a core tuning parameter. Working within the framework of Mixed Integer Optimization, we put forth simple and effective proposals for improvement, which render tuning computationally viable. Through a carefully designed simulation study and a real data application, we provide favorable comparisons with established methods, highlighting pros, cons and avenues for overcoming present limitations.
|