254 – Contributed Poster Presentations: Section on Statistical Learning and Data Science
Sound and Solid Selection of Covariates - a Simulation Study
Kira Dynnes Svendsen
Technical University of Denmark
Line Clemmensen
Technical University of Denmark
Lars Kai Hansen
Technical University of Denmark
Bjarne Kjaer Ersbøll
Technical University of Denmark
In this era of big(ger) data, the perennial question of how to detect the features that are truly relevant for an outcome of interest becomes paramount. As the number of variables in data increases, the degree to which field knowledge is incorporated in the analysis decreases. As more and more automated machine learning methods for handling and extracting information from data become easily accessible, an overview of the qualities and potential pitfalls of the contemporary, heavily used methods is pertinent. Here we present the results of a simulation study assessing the performance of different methods for 'blindfolded' or 'field knowledge free' feature selection. We consider Lasso, Forward selection, Elastic Net, Simplified Relaxed Lasso and two ad hoc methods. We discuss what constitutes good performance and how to assess it, and compare the methods on a variety of assessment measures, both new and existing.
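As an illustration of how selection performance can be scored in a simulation, where the truly relevant features are known by construction, the following minimal sketch compares a selected feature set with the true one. The abstract does not name its assessment measures; the true positive rate, false discovery rate, false positive rate and exact recovery used here are assumed, commonly used choices, and the function name `selection_metrics` and the example indices are hypothetical.

```python
def selection_metrics(true_support, selected, n_features):
    """Compare a selected feature set with the known true support.

    Assumed, commonly used measures; not necessarily those of the study.
    """
    true_support, selected = set(true_support), set(selected)
    tp = len(true_support & selected)   # relevant features that were selected
    fp = len(selected - true_support)   # irrelevant features that were selected
    n_irrelevant = n_features - len(true_support)
    return {
        # Share of truly relevant features that were found
        "tpr": tp / len(true_support) if true_support else 0.0,
        # Share of selected features that are in fact irrelevant
        "fdr": fp / len(selected) if selected else 0.0,
        # Share of irrelevant features that were wrongly selected
        "fpr": fp / n_irrelevant if n_irrelevant else 0.0,
        # True only if the selection matches the true support exactly
        "exact_recovery": selected == true_support,
    }


# Hypothetical example: 10 candidate features, of which 0, 1 and 2 are relevant.
metrics = selection_metrics(true_support=[0, 1, 2], selected=[0, 1, 5, 7], n_features=10)
print(metrics)
```

In a simulation study, such measures would typically be averaged over many simulated data sets for each method, since a single run says little about a method's typical behaviour.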