JSM 2004 - Toronto

Abstract #301935

This is the preliminary program for the 2004 Joint Statistical Meetings in Toronto, Canada. It currently includes the "technical" program (the schedule of invited, topic-contributed, regular contributed, and poster sessions); Continuing Education courses (August 7-10, 2004); and Committee and Business Meetings. This online program will be updated frequently to reflect the most current revisions.


The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.





Activity Number: 383
Type: Contributed
Date/Time: Wednesday, August 11, 2004 : 2:00 PM to 3:50 PM
Sponsor: Biometrics Section
Abstract - #301935
Title: Prequential Tests in Classification and Regression Trees
Author(s): Jialu Zhang*+ and Francoise Seillier-Moiseiwitsch
Companies: University of Maryland, Baltimore County and University of Maryland, Baltimore County
Address: Dept. of Mathematics and Statistics, Herndon, VA, 20170
Keywords: prequential test; classification; CART; cross-validation
Abstract:

The data-mining procedure implemented in CART (Classification and Regression Trees) extracts information from the data by constructing binary decision trees. The current CART algorithm exhaustively searches all levels of all covariates and chooses each node split by minimizing node impurity. Splitting continues until a node contains too few observations or consists of observations of a single type. The resulting large binary tree is then pruned back using cross-validation or some other pruning approach. We incorporate into CART a new model-evaluation method, based on prequential testing, to replace the splitting and pruning steps. As a direct consequence, these steps now involve formal statistical tests to validate the decisions. This approach is suited to both parametric and nonparametric decision rules, and it assesses models on the accuracy of their probabilistic predictions for future events. More specifically, the data are first divided into a training sample and an evaluation sample. The training sample is then extended by one observation at a time, each time updating the estimates and making a prediction for the next outcome.
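The sequential scheme in the last two sentences can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract specifies neither the predictive model nor the test statistic, so the binary outcomes, the Laplace-smoothed frequency estimate, the logarithmic score, and the function name prequential_log_score are all assumptions made for illustration only.

```python
import math


def prequential_log_score(outcomes, n_train):
    """One-step-ahead (prequential) evaluation of a simple binary forecaster.

    outcomes : list of 0/1 outcomes in the order they are observed
    n_train  : number of leading observations used as the initial training sample

    Returns the accumulated logarithmic score and the list of forecasts.
    The forecaster (a Laplace-smoothed relative frequency) and the log score
    are hypothetical choices; the abstract does not specify either.
    """
    if not 0 <= n_train < len(outcomes):
        raise ValueError("n_train must leave at least one evaluation observation")

    successes = sum(outcomes[:n_train])  # estimate from the training sample
    seen = n_train
    total_score = 0.0
    forecasts = []

    for y in outcomes[n_train:]:
        # Predict the next outcome using only the data seen so far.
        p = (successes + 1) / (seen + 2)
        forecasts.append(p)

        # Score the probabilistic prediction against what actually happened.
        total_score += -math.log(p if y == 1 else 1.0 - p)

        # Extend the training sample by this one observation and update.
        successes += y
        seen += 1

    return total_score, forecasts


if __name__ == "__main__":
    score, probs = prequential_log_score([1, 0, 1, 1, 0, 1], n_train=2)
    print("forecasts:", probs)
    print("accumulated log score:", score)
```

A lower accumulated score means the sequence of forecasts put more probability on the outcomes that actually occurred. In a tree setting, a comparison of such scores between candidate models would be one conceivable way to judge a split, but how the authors build their formal tests is not stated in the abstract.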


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.


JSM 2004 For information, contact jsm@amstat.org or phone (888) 231-3473. If you have questions about the Continuing Education program, please contact the Education Department.
Revised March 2004