All Times EDT

Abstract Details

Activity Number: 546 - Foundational Issues in Machine Learning
Type: Topic Contributed
Date/Time: Thursday, August 6, 2020 : 1:00 PM to 2:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #313875
Title: Sparse Logistic Classification Using Multi-Type Predictors
Author(s): Arkaprava Roy* and Bertrand Clarke and Subhashis Ghosal and Diego Jarquin
Companies: University of Florida and U. Nebraska-Lincoln and NCSU and UNL

We consider a binary classification problem with two types of predictors: a low-dimensional set, typically representing phenotypic characteristics, and a high-dimensional set, typically representing genomic information. In such situations, if variable selection is conducted in the usual way, the sheer number of predictors of the latter type may overwhelm those of the former. To address this issue, we propose a new selection mechanism, called the one-pass method, in which we first apply sparse regression of the low-dimensional predictors on the high-dimensional ones and replace the former by the regression residuals. We then perform sparse logistic regression with all predictors, using a penalized forward selection method to obtain the final classifier. To ensure that the low-dimensional residuals are not overwhelmed by the high-dimensional data, the low-dimensional variables are made to enter the sparse logistic regression first. We show that this procedure is numerically effective in simulations and computationally efficient. Our applications are chiefly in agronomy.
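
The two stages described in the abstract can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation. It assumes a lasso fitted by coordinate descent for the residualizing step, and a greedy forward search with a BIC-style cost per added variable as a stand-in for the penalized forward selection. All function names (lasso_cd, residualize, fit_logistic, one_pass) and the tuning values (lam, ridge, penalty) are hypothetical.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent: min (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float)  # running residual y - Xb (b starts at zero)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += X[:, j] * (b[j] - new)
            b[j] = new
    return b

def residualize(Z, X, lam=0.1):
    """Replace each low-dimensional column of Z by its lasso residual on X.
    X is assumed to be column-centered already."""
    R = np.empty(Z.shape, dtype=float)
    for j in range(Z.shape[1]):
        zc = Z[:, j] - Z[:, j].mean()
        R[:, j] = zc - X @ lasso_cd(X, zc, lam)
    return R

def fit_logistic(A, y, ridge=1e-3, n_iter=50):
    """Newton-Raphson logistic fit with intercept and a small ridge term for
    numerical stability. Returns (coefficients, negative log-likelihood)."""
    D = np.column_stack([np.ones(len(y)), A])
    w = np.zeros(D.shape[1])
    for _ in range(n_iter):
        eta = np.clip(D @ w, -30, 30)
        prob = 1.0 / (1.0 + np.exp(-eta))
        grad = D.T @ (prob - y) + ridge * w
        H = (D * (prob * (1 - prob))[:, None]).T @ D + ridge * np.eye(len(w))
        w -= np.linalg.solve(H, grad)
    eta = np.clip(D @ w, -30, 30)
    nll = np.sum(np.logaddexp(0, eta) - y * eta)
    return w, nll

def one_pass(Z, X, y, lam=0.1, penalty=None):
    """Z: low-dimensional predictors, X: high-dimensional predictors, y in {0, 1}.
    Returns the selected column indices (into [residuals of Z, X]) and the fit."""
    n, q = len(y), Z.shape[1]
    # Assumption: BIC-style cost of log(n) per selected variable.
    penalty = np.log(n) if penalty is None else penalty
    Xc = X - X.mean(axis=0)
    A = np.column_stack([residualize(Z, Xc, lam), Xc])  # residuals come first
    selected = []
    best = 2 * fit_logistic(A[:, selected], y)[1]  # intercept-only criterion
    # The low-dimensional group is searched first, then the high-dimensional one.
    for group in (range(q), range(q, A.shape[1])):
        remaining = list(group)
        while remaining:
            scores = [2 * fit_logistic(A[:, selected + [j]], y)[1]
                      + penalty * (len(selected) + 1) for j in remaining]
            k = int(np.argmin(scores))
            if scores[k] >= best:  # no candidate improves the criterion
                break
            best = scores[k]
            selected.append(remaining.pop(k))
    w, _ = fit_logistic(A[:, selected], y)
    return selected, w
```

Calling one_pass(Z, X, y) with Z of shape (n, q) and X of shape (n, p) returns indices below q for selected low-dimensional residuals and indices from q upward for selected high-dimensional predictors. Searching the low-dimensional group to completion before the high-dimensional group is one simple way to let those variables enter first; the abstract does not specify the exact mechanism.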

Authors who are presenting talks have a * after their name.
