| Activity Number: | 572 |
| Type: | Contributed |
| Date/Time: | Thursday, August 6, 2009 : 8:30 AM to 10:20 AM |
| Sponsor: | Biopharmaceutical Section |
| Abstract - #304610 |
| Title: | Deforestation of a Random Forest |
| Author(s): | Matthew Mitchell*+ |
| Companies: | Metabolon, Inc. |
| Address: | P.O. Box 110407, Research Triangle Park, NC, 27709 |
| Keywords: | Random Forest; CART; variable selection |
| Abstract: | In metabolomics data, where the number of samples n is much smaller than the number of variables p, random forest analysis is an excellent tool for assessing the true predictive ability of potential biomarkers. The random forest can also be a useful tool for variable selection in screening data. However, after obtaining screening data, we often want to produce a simple, easily interpretable set of rules for the final prediction on any future data set. A forest with potentially thousands of trees and hundreds of variables, in which many variables appear in multiple trees with different splitting points, is not useful for clinical prediction. We propose a permutation-based method to first select the truly predictive variables from the random forest. We then use a weighted CART algorithm to produce a small, easily interpretable prediction algorithm. |
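The abstract does not spell out the procedure, so the following is only a minimal sketch of the general two-step idea (permutation-based variable selection, then one small weighted tree), not the author's method. To stay dependency-free, it makes several assumptions: the best single-split Gini decrease stands in for a random forest importance measure, the "weighted CART" is reduced to a depth-one tree (a stump) with per-sample weights, and the toy data, the function names (`permutation_select`, `fit_weighted_stump`, `predict`), the number of permutations and the 0.05 cutoff are all hypothetical.

```python
import random

def gini(y, w):
    """Weighted Gini impurity for binary labels (0/1)."""
    total = sum(w)
    if total == 0:
        return 0.0
    p1 = sum(wi for yi, wi in zip(y, w) if yi == 1) / total
    return 2.0 * p1 * (1.0 - p1)

def best_split(x, y, w):
    """Best single threshold on x: returns (impurity decrease, threshold)."""
    total, parent = sum(w), gini(y, w)
    best_gain, best_thr = 0.0, None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        left = [i for i in range(len(x)) if x[i] <= thr]
        right = [i for i in range(len(x)) if x[i] > thr]
        child = 0.0
        for side in (left, right):
            ws = [w[i] for i in side]
            child += sum(ws) * gini([y[i] for i in side], ws)
        gain = parent - child / total
        if gain > best_gain:
            best_gain, best_thr = gain, thr
    return best_gain, best_thr

def permutation_select(X, y, n_perm=200, alpha=0.05, seed=0):
    """Keep variables whose observed score is rarely matched under shuffled labels.

    Assumption: the score is the single-split Gini decrease, used here as a
    simple stand-in for a random forest importance measure.
    """
    rng = random.Random(seed)
    w = [1.0] * len(y)
    selected = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        observed, _ = best_split(col, y, w)
        exceed = 0
        for _ in range(n_perm):
            y_perm = y[:]
            rng.shuffle(y_perm)
            if best_split(col, y_perm, w)[0] >= observed:
                exceed += 1
        p_value = (exceed + 1) / (n_perm + 1)
        if p_value <= alpha:
            selected.append(j)
    return selected

def fit_weighted_stump(X, y, w, variables):
    """Depth-one weighted tree restricted to the selected variables."""
    best = (0.0, None, None)
    for j in variables:
        gain, thr = best_split([row[j] for row in X], y, w)
        if gain > best[0]:
            best = (gain, j, thr)
    _, j, thr = best
    if j is None:
        raise ValueError("no informative split among the given variables")

    def majority(side):
        # Weighted majority vote of the samples falling on one side.
        w1 = sum(w[i] for i in side if y[i] == 1)
        w0 = sum(w[i] for i in side if y[i] == 0)
        return 1 if w1 > w0 else 0

    left = [i for i in range(len(y)) if X[i][j] <= thr]
    right = [i for i in range(len(y)) if X[i][j] > thr]
    return {"variable": j, "threshold": thr,
            "left": majority(left), "right": majority(right)}

def predict(stump, row):
    """Apply the single rule: compare one variable with one threshold."""
    side = "left" if row[stump["variable"]] <= stump["threshold"] else "right"
    return stump[side]

# Hypothetical toy data: variable 0 separates the classes, variables 1-4 are noise.
rng = random.Random(1)
y = [0] * 10 + [1] * 10
X = [[2.0 * label + rng.random()] + [rng.random() for _ in range(4)] for label in y]

selected = permutation_select(X, y)
weights = [2.0 if label == 1 else 1.0 for label in y]  # assumed class weights
stump = fit_weighted_stump(X, y, weights, selected)
print("selected variables:", selected)
print("rule:", stump)
```

The final rule is a single variable and threshold, which is the kind of small, readable output the abstract argues for. A realistic analysis would replace the stand-in score with an actual forest importance and grow a deeper tree.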
- The address information is for the authors that have a + after their name.
- Authors who are presenting talks have a * after their name.
Back to the full JSM 2009 program