Feature Selection for Support Vector Regression Using a Genetic Algorithm (306501)
*Shannon B McKearnan, University of Minnesota
David M Vock, University of Minnesota
Julian Wolfson, University of Minnesota
Keywords: support vector regression, variable selection, genetic algorithm
Support vector regression (SVR) is a modification of support vector machines for use with continuous outcomes. SVR is particularly useful when the relationship between the outcome and the predictors is non-linear, because a kernel mapping of the predictors to a high-dimensional feature space can be applied to transform the data. However, when many covariates are available, as in genetic or imaging analyses, the method's flexibility can lead to overfitting and an overall loss of predictive accuracy. To overcome this, we develop a feature selection method for SVR based on a genetic algorithm, an optimization technique inspired by evolutionary processes. The genetic algorithm iteratively searches across candidate subsets of variables to find those that yield the best performance according to a user-defined fitness function. In a simulation study, we evaluate the performance of our feature selection method and compare it with alternative methods, including LASSO and random forest. In addition, we apply the method to risk prediction for patients with oropharyngeal cancer using high-dimensional imaging data.
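To make the search over variable subsets concrete, the following is a minimal Python sketch of genetic-algorithm feature selection wrapped around an SVR. It is not the authors' implementation: the abstract does not specify the encoding, the genetic operators, or the fitness function, so everything here is an assumption chosen for illustration. Each candidate subset is a boolean mask, the hypothetical `fitness` function is the 3-fold cross-validated negative mean squared error of an RBF-kernel SVR from scikit-learn, and the hypothetical `ga_select` function uses tournament selection, uniform crossover, bit-flip mutation, and elitism. The data are synthetic.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def fitness(mask, X, y):
    """Assumed fitness: mean 3-fold CV negative MSE of an RBF SVR on the masked columns."""
    if not mask.any():
        return -np.inf  # an empty subset cannot be fit, so it is never preferred
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
    scores = cross_val_score(model, X[:, mask], y, cv=3,
                             scoring="neg_mean_squared_error")
    return scores.mean()


def ga_select(X, y, pop_size=12, n_gen=8, p_mut=0.05, seed=0):
    """Illustrative GA: returns the best mask evaluated and its fitness."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    # Initial population: each feature is included with probability 0.5.
    pop = rng.random((pop_size, n_feat)) < 0.5
    # Fall back to "all features" if no individual is ever scorable.
    best_mask, best_fit = np.ones(n_feat, dtype=bool), -np.inf
    for _gen in range(n_gen):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        top = int(np.argmax(scores))
        if scores[top] > best_fit:
            best_mask, best_fit = pop[top].copy(), scores[top]
        children = [best_mask.copy()]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            # Tournament selection: each parent is the better of two random individuals.
            parents = []
            for _k in range(2):
                a, b = rng.integers(pop_size, size=2)
                parents.append(pop[a] if scores[a] >= scores[b] else pop[b])
            # Uniform crossover: each gene comes from either parent with equal probability.
            take_first = rng.random(n_feat) < 0.5
            child = np.where(take_first, parents[0], parents[1])
            # Bit-flip mutation: toggle each gene with probability p_mut.
            child = np.logical_xor(child, rng.random(n_feat) < p_mut)
            children.append(child)
        pop = np.array(children)
    return best_mask, best_fit


# Synthetic example: 10 predictors, of which 3 are informative.
X, y = make_regression(n_samples=80, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)
mask, score = ga_select(X, y)
print("selected features:", np.flatnonzero(mask))
print("cross-validated negative MSE:", round(float(score), 2))
```

Because the fitness function is passed the mask and the data only, it can be swapped for any user-defined criterion, for example a different loss or a penalty on the number of selected features, without changing the search loop. The population size, number of generations, and mutation rate above are small placeholder values intended to keep the example fast, not tuned recommendations.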