JSM Preliminary Online Program
This is the preliminary program for the 2009 Joint Statistical Meetings in Washington, DC.

The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.






Activity Number: 572
Type: Contributed
Date/Time: Thursday, August 6, 2009 : 8:30 AM to 10:20 AM
Sponsor: Biopharmaceutical Section
Abstract - #304610
Title: Deforestation of a Random Forest
Author(s): Matthew Mitchell*+
Companies: Metabolon, Inc.
Address: P.O. Box 110407, Research Triangle Park, NC, 27709
Keywords: Random Forest ; CART ; variable selection
Abstract:

In metabolomics data, where the number of samples n is much less than the number of variables p, random forest analysis is an excellent tool for assessing the true predictive ability of potential biomarkers. The random forest can also be a useful tool for variable selection in screening data. However, after obtaining screening data, we often want a simple, easily interpretable set of rules that gives the final prediction for any future data set. A forest with potentially thousands of trees and hundreds of variables, in which many variables appear in multiple trees with different splitting points, is not useful for clinical prediction. We propose a permutation-based method that first selects the truly predictive variables from the random forest. We then use a weighted CART algorithm to produce a small, easily interpretable prediction algorithm.
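The abstract does not say which importance measure or permutation scheme is used, so the following is only a rough sketch of one way a permutation-based screen could work, not the author's method. It assumes a toy forest of one-split trees (stumps) in place of a full random forest, Gini impurity decrease as the importance score, and a threshold equal to the largest importance seen for any variable when the class labels are shuffled. All function names, parameters, and the simulated data are hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_stump(X, y, rows, feats):
    """Best single split over the candidate features: (impurity decrease, feature)."""
    parent = gini([y[i] for i in rows])
    best_gain, best_feat = 0.0, None
    for j in feats:
        for t in sorted({X[i][j] for i in rows}):
            left = [y[i] for i in rows if X[i][j] <= t]
            right = [y[i] for i in rows if X[i][j] > t]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if parent - child > best_gain:
                best_gain, best_feat = parent - child, j
    return best_gain, best_feat

def stump_forest_importance(X, y, n_trees, mtry, rng):
    """Mean impurity decrease per variable over a forest of bootstrapped stumps."""
    n, p = len(X), len(X[0])
    importance = [0.0] * p
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]   # bootstrap sample of rows
        feats = rng.sample(range(p), mtry)            # random subset of variables
        gain, j = best_stump(X, y, rows, feats)
        if j is not None:
            importance[j] += gain
    return [v / n_trees for v in importance]

def permutation_select(X, y, n_trees=200, mtry=6, n_perm=20, seed=0):
    """Keep variables whose importance exceeds every importance seen under shuffled labels."""
    rng = random.Random(seed)
    observed = stump_forest_importance(X, y, n_trees, mtry, rng)
    null_max = []
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)                           # break any link between X and y
        null_max.append(max(stump_forest_importance(X, y_perm, n_trees, mtry, rng)))
    threshold = max(null_max)                         # assumed, conservative cutoff
    selected = [j for j, v in enumerate(observed) if v > threshold]
    return selected, observed, threshold

def make_toy_data(n=20, p=40, seed=1):
    """Simulated n < p data in which only column 0 separates the two classes."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    for i in range(n):
        X[i][0] = 3.0 * y[i] + rng.gauss(0.0, 0.1)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    selected, observed, threshold = permutation_select(X, y)
    print("selected variables:", selected)
    print("null threshold:", round(threshold, 4))
```

Taking the maximum over all variables and all permutations is a deliberately strict choice that guards against selecting noise variables when p is large; a per-variable permutation p-value would be a less conservative alternative. The selected variables would then be passed to the second stage, a weighted CART fit, whose weighting scheme the abstract does not describe.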


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.



Revised September, 2008