JSM 2004 - Toronto

Abstract #300239

The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.



Activity Number: 96
Type: Invited
Date/Time: Monday, August 9, 2004 : 10:30 AM to 12:20 PM
Sponsor: SSC
Abstract - #300239
Title: An Out-of-bag Method for Regularizing Boosted Regression
Author(s): Greg Ridgeway*+
Companies: RAND Corporation
Address: 1700 Main Street, Santa Monica, CA 90407-2138
Keywords: nonparametric regression ; boosting ; out-of-bag estimation ; regularization
Abstract:

Selecting the optimal number of iterations is a key step when using any of the various flavors of boosting. The standard practice is to hold out some fraction of the dataset as a test set and iterate the boosting algorithm until predictive performance on the test set no longer improves. This practice, however, allocates a large part of the dataset to estimating the optimal number of iterations, diluting the amount of information available for building the model structure. A variation on out-of-bag estimation can provide an approximately unbiased estimate of the improvement in generalization error attributable to the current iteration without decreasing the amount of data available for learning the model structure. When the out-of-bag estimate of improvement is zero, the iterations stop. The R package "gbm" implements this technique, and I demonstrate on several datasets that this procedure offers a fully automated process for fitting boosting models, requiring fewer iterations and offering improved predictive performance.
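
The stopping rule described in the abstract can be illustrated with a minimal sketch. This is not the gbm implementation; it assumes squared-error loss, one-dimensional inputs, regression stumps as base learners, and a fresh random half-sample ("bag") at each iteration. The names fit_stump, boost_with_oob, predict, and the parameters shrinkage, bag_fraction, and window are hypothetical, and averaging the out-of-bag improvement over a short window before testing for zero is an added assumption to damp noise.

```python
import math
import random

def fit_stump(x, r, rows):
    """Least-squares stump fitted to residuals r using only the given rows."""
    order = sorted(rows, key=lambda i: x[i])
    n = len(order)
    total = sum(r[i] for i in order)
    best_score, thr, lm, rm = float("-inf"), None, total / n, total / n
    left_sum = 0.0
    for k in range(1, n):
        left_sum += r[order[k - 1]]
        if x[order[k - 1]] == x[order[k]]:
            continue  # cannot split between identical x values
        right_sum = total - left_sum
        # Maximizing this score is equivalent to minimizing squared error.
        score = left_sum ** 2 / k + right_sum ** 2 / (n - k)
        if score > best_score:
            best_score = score
            thr = (x[order[k - 1]] + x[order[k]]) / 2
            lm, rm = left_sum / k, right_sum / (n - k)
    if thr is None:
        return lambda v: lm
    return lambda v: lm if v <= thr else rm

def boost_with_oob(x, y, shrinkage=0.1, bag_fraction=0.5,
                   max_iter=500, window=20, seed=0):
    """Boost stumps on random bags; stop when OOB improvement vanishes."""
    rng = random.Random(seed)
    n = len(x)
    f0 = sum(y) / n
    pred = [f0] * n
    stumps, oob_improve = [], []
    for _ in range(max_iter):
        bag = rng.sample(range(n), int(bag_fraction * n))
        in_bag = set(bag)
        oob = [i for i in range(n) if i not in in_bag]
        resid = [y[i] - pred[i] for i in range(n)]
        h = fit_stump(x, resid, bag)  # structure learned from the bag only
        step = [shrinkage * h(x[i]) for i in range(n)]
        # Out-of-bag estimate of the loss reduction due to this iteration.
        before = sum(resid[i] ** 2 for i in oob) / len(oob)
        after = sum((resid[i] - step[i]) ** 2 for i in oob) / len(oob)
        oob_improve.append(before - after)
        pred = [pred[i] + step[i] for i in range(n)]
        stumps.append(h)
        # Assumption: smooth over a window before declaring "zero improvement".
        if len(oob_improve) >= window and sum(oob_improve[-window:]) <= 0:
            break
    return f0, stumps, oob_improve

def predict(f0, stumps, shrinkage, v):
    """Prediction of the boosted model at a single input value v."""
    return f0 + shrinkage * sum(h(v) for h in stumps)

# Toy usage on synthetic data (illustrative only, not from the abstract).
data_rng = random.Random(1)
x = [data_rng.uniform(0.0, 6.28) for _ in range(200)]
y = [math.sin(v) + data_rng.gauss(0.0, 0.3) for v in x]
f0, stumps, oob_improve = boost_with_oob(x, y)
print(len(stumps), "iterations fitted")
```

Because a new bag is drawn at every iteration, every observation contributes to learning the model structure over the course of the fit, while the observations left out of the current bag supply the improvement estimate for that iteration. No fixed test set is held out.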


  • The address information is for the authors who have a + after their name.
  • Authors who are presenting talks have a * after their name.


Revised March 2004