Online Program


All Times ET

Thursday, June 3
Practice and Applications
Classification and Simulation: Methods, Analyses, and Applications
Thu, Jun 3, 10:00 AM - 11:35 AM
TBD
 

Pathway and Gene Selection with Guided Regularized Random Forests (309811)

Daniel Brumley, University of Central Oklahoma 
Sounak Chakraborty, University of Missouri 
*Tyler Cook, University of Central Oklahoma 

Keywords: random forest, regularization, gene selection, pathway selection

Many approaches have been developed to model a biological outcome from microarray data. Recent work has focused on incorporating gene interactions through genetic pathway information available in online databases, since knowledge of gene relationships may help researchers better understand the biological processes under investigation. We propose a method for pathway and gene selection based on guided regularized random forests (GRRF) that ranks both pathways and genes in classification problems. In GRRF, variable importance scores from a random forest guide a regularization procedure that identifies a subset of significant predictors. Simulation studies and an analysis of a breast cancer dataset show that our methodology identifies a compact set of important pathways and genes while achieving a low prediction error rate.
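
The guiding step described above can be illustrated with a minimal sketch. It assumes the usual gene-level GRRF formulation, which the abstract does not spell out: each feature gets a penalty coefficient (1 - gamma) * lambda0 + gamma * (importance / max importance), and the split gain of any feature not yet selected is multiplied by that coefficient. The function names, gene names, and all numbers are hypothetical, and the sketch does not cover the pathway-level ranking proposed in the talk.

```python
def guided_penalties(importances, gamma, lambda0=1.0):
    """Per-feature penalty coefficients guided by random forest importance scores.

    Assumed form: (1 - gamma) * lambda0 + gamma * normalized importance.
    """
    top = max(importances.values())
    if top <= 0:
        raise ValueError("at least one importance score must be positive")
    return {
        name: (1.0 - gamma) * lambda0 + gamma * (score / top)
        for name, score in importances.items()
    }


def regularized_gain(gain, feature, selected, penalties):
    """Shrink the split gain of a feature that is not yet in the selected set."""
    return gain if feature in selected else penalties[feature] * gain


def choose_split(node_gains, selected, penalties):
    """Pick the feature with the largest regularized gain and record it as selected."""
    best = max(
        node_gains,
        key=lambda f: regularized_gain(node_gains[f], f, selected, penalties),
    )
    selected.add(best)
    return best


# Hypothetical importance scores from an ordinary random forest (made-up numbers).
importances = {"gene_a": 0.40, "gene_b": 0.10, "gene_c": 0.30, "gene_d": 0.20}
penalties = guided_penalties(importances, gamma=0.8)

# Features already used by earlier splits in the forest.
selected = {"gene_c"}

# Hypothetical raw impurity gains at one tree node.
# gene_b has the largest raw gain, but its low importance shrinks it below gene_a.
node_gains = {"gene_a": 0.050, "gene_b": 0.060, "gene_c": 0.045, "gene_d": 0.030}
best = choose_split(node_gains, selected, penalties)
```

In this toy setting, a larger gamma lets the importance scores dominate the penalty, so weakly important features must show a much larger gain before they enter the selected set. That is what keeps the final subset compact.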