JSM Preliminary Online Program
This is the preliminary program for the 2006 Joint Statistical Meetings in Seattle, Washington.

The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.


Activity Number: 368
Type: Topic Contributed
Date/Time: Wednesday, August 9, 2006, 8:30 AM to 10:20 AM
Sponsor: Biopharmaceutical Section
Abstract - #305997
Title: Application of RandomForest as a Variable Selection Tool on Biomarker Data
Author(s): Katja Remlinger*+
Companies: GlaxoSmithKline
Address: 5412 Silver Moon Lane, Raleigh, NC 27606
Keywords: cross validation; classification; selection bias; prediction model; supervised learning; high dimensional data
Abstract:

In supervised learning problems involving high-dimensional data, it is often desirable to reduce the number of variables. Biomarker data are usually high-dimensional and therefore a good candidate for variable selection approaches. Identifying a small number of important markers is essential not only for achieving high accuracy in the prediction model, but also for allowing an easy biological interpretation of the model. Using biomarker data from an obesity study, we will illustrate the challenges involved in the model building and marker selection process. In particular, we will use a modified version of the RandomForest Wrapper Algorithm (Svetnik et al. 2004) to build models based on a small number of important markers and compare the results with those of other approaches.


  • The address information is for authors who have a + after their name.
  • Authors who are presenting talks have a * after their name.

JSM 2006: For information, contact jsm@amstat.org or phone (888) 231-3473. If you have questions about the Continuing Education program, please contact the Education Department.
Revised April, 2006