JSM 2004 - Toronto

Abstract #300713

This is the preliminary program for the 2004 Joint Statistical Meetings in Toronto, Canada. It currently includes the technical program (the schedule of invited, topic-contributed, regular contributed, and poster sessions); Continuing Education courses (August 7-10, 2004); and committee and business meetings. This online program will be updated frequently to reflect the most current revisions.


The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.





Activity Number: 22
Type: Contributed
Date/Time: Sunday, August 8, 2004 : 2:00 PM to 3:50 PM
Sponsor: IMS
Abstract - #300713
Title: Tree-based Extrapolation Diagnostics for Machine Learning
Author(s): Giles Hooker*+
Companies: Stanford University
Address: Statistics Dept., Stanford University, CA 94305
Keywords: CART ; diagnostics ; extrapolation ; tree ; machine learning ; nonparametric
Abstract:

Machine-learning applications typically occur in very high-dimensional contexts and usually involve datasets of highly correlated predictors. As a result, the hypercube bounding the data contains large unpopulated regions. Moreover, many learning procedures are highly flexible, so their behavior is uncontrolled in these regions. This is a problem when the distribution of the predictor variables shifts. It is also a concern for functional diagnostics, which often require the function to be evaluated on a product measure. We frame the extrapolation diagnostic as a test statistic for whether a point originates from the data distribution, against a uniform null hypothesis. This framing lets us estimate the statistic with general classification methods. Further, we observe that CART can be given an exact distribution as an argument to provide a more stable estimate. This is the basis of our extrapolation-detection procedure. We explore some advantages of this approach and present examples of it working in practice.
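
The idea of treating extrapolation detection as classification of data against a uniform reference can be illustrated with a minimal sketch. This is not the author's implementation: the toy data, the sample sizes, the `min_leaf` setting, and every function name below are assumptions made for illustration. The sketch draws uniform points over the bounding box of the data, grows a small CART-style tree with Gini splits to separate the two samples, and uses the leaf proportion of real data as a score, so that low scores suggest a point lies in an unpopulated region.

```python
import random


def gini(pos, n):
    """Gini impurity of a node holding `pos` data points out of `n` points."""
    if n == 0:
        return 0.0
    p = pos / n
    return 2.0 * p * (1.0 - p)


def best_split(rows, min_leaf):
    """Return the (feature, threshold) with lowest weighted Gini, or None."""
    n = len(rows)
    total_pos = sum(label for _, label in rows)
    best, best_score = None, float("inf")
    for j in range(len(rows[0][0])):
        ordered = sorted(rows, key=lambda r: r[0][j])
        left_pos = 0
        for i in range(1, n):
            left_pos += ordered[i - 1][1]
            lo, hi = ordered[i - 1][0][j], ordered[i][0][j]
            # Skip ties and splits that would leave a child too small.
            if lo == hi or i < min_leaf or n - i < min_leaf:
                continue
            score = (i * gini(left_pos, i)
                     + (n - i) * gini(total_pos - left_pos, n - i)) / n
            if score < best_score:
                best, best_score = (j, (lo + hi) / 2.0), score
    return best


def grow(rows, min_leaf):
    """Grow a tree; each leaf stores the fraction of its points that are data."""
    pos = sum(label for _, label in rows)
    split = None if pos in (0, len(rows)) else best_split(rows, min_leaf)
    if split is None:
        return pos / len(rows)
    j, thr = split
    left = [r for r in rows if r[0][j] <= thr]
    right = [r for r in rows if r[0][j] > thr]
    if not left or not right:
        return pos / len(rows)
    return (j, thr, grow(left, min_leaf), grow(right, min_leaf))


def score(tree, x):
    """Estimated P(point is data | x); low values suggest extrapolation."""
    while isinstance(tree, tuple):
        j, thr, left, right = tree
        tree = left if x[j] <= thr else right
    return tree


# Toy data (an assumption): two highly correlated predictors.
random.seed(0)
n = 600
data = []
for _ in range(n):
    x1 = random.random()
    data.append((x1, x1 + random.gauss(0.0, 0.03)))

# Uniform reference sample over the hypercube bounding the data.
lows = [min(p[j] for p in data) for j in range(2)]
highs = [max(p[j] for p in data) for j in range(2)]
uniform = [tuple(random.uniform(lo, hi) for lo, hi in zip(lows, highs))
           for _ in range(n)]

# Label data 1 and uniform points 0, then fit the tree.
rows = [(p, 1) for p in data] + [(p, 0) for p in uniform]
tree = grow(rows, min_leaf=10)

on_manifold = score(tree, (0.5, 0.5))     # near the populated diagonal
off_manifold = score(tree, (0.05, 0.95))  # inside the box, far from the data
print(on_manifold, off_manifold)
```

Because the two samples are the same size, a leaf proportion `p` corresponds to an estimated density ratio of `p / (1 - p)` between the data and the uniform reference. The sketch relies on a sampled uniform reference; the refinement mentioned in the abstract, in which CART is given the exact uniform distribution rather than a sample, is not implemented here.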


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.


Revised March 2004