Activity Number: 166
Type: Contributed
Date/Time: Monday, August 3, 2009, 10:30 AM to 12:20 PM
Sponsor: Section on Nonparametric Statistics
Abstract - #305481
Title: Random Forests versus Logistic Regression: A Comparison Using Real and Simulated Data
Author(s): Kathy L. Gray*+
Companies: California State University, Chico
Address: 400 West First St, Chico, CA 95929
Keywords: random forests; logistic regression; measurement error; model performance
Abstract:
The goal of this study was to compare model performance between the newer technique known as random forest and the more familiar technique, logistic regression. The random forest procedure, an extension of classification trees, involves resampling both observations and variables to obtain a collection of trees that can be used for prediction. There remain many unexplored questions about the performance of the random forest model. Generated data sets were used to compare error rates of the two methods under a variety of conditions. A sensitivity study was conducted to gauge the robustness of the model in the presence of measurement error. The results show that, under certain scenarios, the random forest procedure outperforms logistic regression.
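The resampling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the author's code, and the function name `draw_tree_sample` and its parameters are hypothetical. It assumes observations are drawn with replacement (a bootstrap sample) and that one subset of variables is drawn per tree; the standard random forest algorithm instead draws a fresh variable subset at each split, which is omitted here for brevity.

```python
import random


def draw_tree_sample(n_obs, n_vars, vars_per_tree, rng):
    """Return (row indices, variable indices) for one tree.

    Rows are drawn with replacement (a bootstrap sample of the observations).
    Variables are drawn without replacement. Assumption: one subset per tree,
    a simplification of the per-split draw used by random forests.
    """
    rows = [rng.randrange(n_obs) for _ in range(n_obs)]
    cols = sorted(rng.sample(range(n_vars), vars_per_tree))
    return rows, cols


rng = random.Random(0)  # fixed seed so the sketch is repeatable
for _ in range(5):
    rows, cols = draw_tree_sample(n_obs=100, n_vars=10, vars_per_tree=3, rng=rng)
    # Observations never drawn are "out of bag" for this tree.
    out_of_bag = set(range(100)) - set(rows)
    print(len(set(rows)), "distinct rows in bag,",
          len(out_of_bag), "out of bag, variables", cols)
```

Each tree in the collection would be fit to its own draw, and the out-of-bag observations give one way to estimate a tree's error on data it did not see.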
- The address information is for the authors that have a + after their name.
- Authors who are presenting talks have a * after their name.
Back to the full JSM 2009 program