Abstract Details

Activity Number: 376
Type: Contributed
Date/Time: Tuesday, August 2, 2016 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #320739
Title: Multicategory Classification Using High-Dimensional Predictors with Applications to Studying Effects of Rice Genome
Author(s): Arkaprava Roy* and Subhashis Ghoshal
Companies: North Carolina State University and North Carolina State University
Keywords: High Dimensional; Genomics; Logistic Regression; Lasso
Abstract:

We develop a multicategory classification technique that is particularly useful when the predictors are of two types, present in unbalanced numbers, and one type tends to mask the effects of the other. Our method is motivated by the problem of classifying rice into one of five groups based on its genome data and the effects of exogenous variables. The gene expressions are high dimensional, so variable selection is needed, but one type of variable should not overwhelm the other in selection. Experience shows that the effects of gene expressions tend to be shadowed by the macro variables, which dominate in a sparse classification procedure. We address the issue by representing the macro variables by their respective sparse regression residuals and then considering all variables together in a variable-selection-cum-classification procedure based on a high-dimensional penalized logistic regression framework. We proceed by selecting one variable at a time in a forward selection framework, with an objective function that also includes a penalty term. The proposed approach is shown to select very sparse models without losing predictive power.
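
The forward selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the form of the penalty, so the sketch assumes a fixed cost `lam` per selected variable. It also assumes the macro variables have already been replaced by their sparse regression residuals. The function names, the plain gradient-descent fit, and the synthetic data are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax for one observation."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fit_nll(X, y, cols, n_classes, steps=200, lr=0.3):
    """Fit a multinomial logistic model on the columns in `cols` by plain
    gradient descent; return the negative log-likelihood of the final fit."""
    n, dim = len(X), len(cols) + 1
    rows = [[1.0] + [x[j] for j in cols] for x in X]  # intercept first
    W = [[0.0] * dim for _ in range(n_classes)]       # one weight row per class

    def probs(r):
        return softmax([sum(w * v for w, v in zip(W[k], r))
                        for k in range(n_classes)])

    for _ in range(steps):
        grad = [[0.0] * dim for _ in range(n_classes)]
        for r, label in zip(rows, y):
            p = probs(r)
            for k in range(n_classes):
                err = p[k] - (1.0 if k == label else 0.0)
                for d in range(dim):
                    grad[k][d] += err * r[d] / n
        for k in range(n_classes):
            for d in range(dim):
                W[k][d] -= lr * grad[k][d]
    return -sum(math.log(max(probs(r)[label], 1e-300))
                for r, label in zip(rows, y))


def forward_select(X, y, n_classes, lam):
    """Add one variable at a time while the penalized objective
    NLL + lam * (number of selected variables) keeps decreasing.
    The per-variable penalty is an assumption, not taken from the abstract."""
    p = len(X[0])
    selected = []
    best = fit_nll(X, y, selected, n_classes)  # intercept-only model
    trace = [best]
    while len(selected) < p:
        scored = [(fit_nll(X, y, selected + [j], n_classes)
                   + lam * (len(selected) + 1), j)
                  for j in range(p) if j not in selected]
        score, j = min(scored)
        if score >= best:  # no candidate improves the objective: stop
            break
        selected.append(j)
        best = score
        trace.append(score)
    return selected, trace


# Synthetic data: column 0 carries the class signal, columns 1-3 are noise.
rng = random.Random(0)
centers = [-2.0, 0.0, 2.0]
y = [i % 3 for i in range(45)]
X = [[centers[c] + rng.gauss(0, 0.5)] + [rng.gauss(0, 1) for _ in range(3)]
     for c in y]

selected, trace = forward_select(X, y, n_classes=3, lam=5.0)
print("selected columns:", selected)
print("objective trace:", [round(t, 2) for t in trace])
```

A larger `lam` stops the search earlier and gives a sparser model, which is the trade-off between sparsity and predictive power that the abstract refers to.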


Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

 
 
Copyright © American Statistical Association