All Times EDT

Abstract Details

Activity Number: 215 - Contributed Poster Presentations: Section on Statistical Learning and Data Science
Type: Contributed
Date/Time: Tuesday, August 4, 2020 : 10:00 AM to 2:00 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #313844
Title: Feature Selection for Support Vector Regression Using a Genetic Algorithm
Author(s): Shannon McKearnan* and David Vock and Julian Wolfson
Companies: University of Minnesota and University of Minnesota and University of Minnesota
Keywords: support vector regression; variable selection; genetic algorithm
Abstract:

Support vector regression (SVR) is particularly useful when the outcome and predictors have a non-linear relationship. However, when many covariates are available, the method’s flexibility can lead to overfitting and an overall loss in predictive accuracy. To overcome this, we develop a feature selection method for SVR based on a genetic algorithm that iteratively searches across candidate subsets of variables to find those that yield the best performance according to a user-defined fitness function. In a simulation study, we evaluate our feature selection method for SVR and compare it with alternative methods, including LASSO and random forest. Our method yields higher predictive accuracy than SVR without feature selection, and it outperforms LASSO when the relationship between covariates and outcome is non-linear. Random forest performs equivalently to our method in some scenarios but worse when covariates are correlated. In addition, we apply our method to predict forced expiratory volume at one year after lung transplant using data from the United Network for Organ Sharing national registry.
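
The search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the genetic operators, so tournament selection, uniform crossover, bit-flip mutation, and elitism are assumptions, and the names `genetic_feature_selection` and `toy_fitness` are hypothetical. The fitness function is user-defined, as in the abstract; here a toy score stands in for what would more plausibly be something like the negative cross-validated prediction error of an SVR fitted on the selected covariates.

```python
import random
from typing import Callable, Dict, Tuple

Mask = Tuple[bool, ...]  # mask[j] is True if covariate j is included


def genetic_feature_selection(
    n_features: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 30,
    n_generations: int = 40,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Tuple[Mask, float]:
    """Search over feature subsets; higher fitness is better."""
    rng = random.Random(seed)
    # Start from random inclusion masks.
    population = [
        tuple(rng.random() < 0.5 for _ in range(n_features))
        for _ in range(pop_size)
    ]
    cache: Dict[Mask, float] = {}

    def score(mask: Mask) -> float:
        # Cache scores, since a real fitness (model fitting) is expensive.
        if mask not in cache:
            cache[mask] = fitness(mask)
        return cache[mask]

    def tournament() -> Mask:
        # Assumed selection rule: the better of two random individuals.
        a, b = rng.sample(population, 2)
        return a if score(a) >= score(b) else b

    best = max(population, key=score)
    for _ in range(n_generations):
        children = [best]  # elitism: carry the current best forward
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            # Uniform crossover: each gene comes from either parent.
            child = [x if rng.random() < 0.5 else y for x, y in zip(p1, p2)]
            # Bit-flip mutation: occasionally toggle inclusion.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(tuple(child))
        population = children
        best = max(population, key=score)
    return best, score(best)


# Toy stand-in fitness (an assumption for illustration only):
# covariates 0, 2, and 5 are informative; every other one costs 0.5.
INFORMATIVE = {0, 2, 5}


def toy_fitness(mask: Mask) -> float:
    gain = sum(1.0 for j in INFORMATIVE if mask[j])
    penalty = sum(0.5 for j, used in enumerate(mask) if used and j not in INFORMATIVE)
    return gain - penalty


best_mask, best_score = genetic_feature_selection(10, toy_fitness)
print([j for j, used in enumerate(best_mask) if used], best_score)
```

To move from the toy score to the setting of the abstract, `toy_fitness` would be replaced by a function that fits an SVR on the masked covariates and returns a performance measure where larger is better; the choice of measure is left to the user.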


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program