Activity Number:
|
222
|
Type:
|
Invited
|
Date/Time:
|
Monday, August 1, 2016 : 2:00 PM to 3:50 PM
|
Sponsor:
|
Section on Statistical Computing
|
Abstract #318072
|
|
Title:
|
Some Ideas Left Out of CART
|
Author(s):
|
Padraic Grantier Neville*
|
Companies:
|
SAS Institute
|
Keywords:
|
Decision Tree; CART; Breiman
|
Abstract:
|
The authors of the 1984 CART book and software experimented with many ideas for decision trees before publishing. I will present some of Leo Breiman's ideas that were left out. They are recorded in 220 pages of memoranda written in 1978 and 1979. One idea is to model a response variable that has many classes by using hyperplanes defined by eigenvectors of a covariance matrix. This idea breaks with Leo's customary advocacy of interpretable splitting rules. In another memo he shows that there are natural splitting criteria under which the optimal splitting rules send observations fractionally to both child nodes, so that every observation may have a positive probability of being in every node. Leo also discusses tree-based density estimation (with Young) and shrinking the estimates in the child nodes toward those in the parent node. We also note that, more than a decade before C4.5 did, Jerry Friedman handled missing values of the splitting variable by sending a fraction of the observation to each child node.
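As a rough illustration of two of the ideas named above, the sketch below sends an observation with a missing split value fractionally to both child nodes and then shrinks each child estimate toward the parent estimate. The abstract does not give the formulas used in the memoranda, so everything here is an assumption: the function names are hypothetical, the fractional weights are taken to be the shares of non-missing observations going left and right, and the shrinkage is a simple convex combination with a user-chosen factor.

```python
from typing import List, Optional, Tuple


def fractional_split(
    values: List[Optional[float]], threshold: float
) -> Tuple[List[float], List[float]]:
    """Return per-observation weights for the left and right child nodes.

    Assumption: an observation whose split value is missing (None) is sent
    to both children, weighted by the share of non-missing observations
    that fall on each side of the threshold.
    """
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("at least one non-missing value is required")
    p_left = sum(1 for v in known if v <= threshold) / len(known)

    left, right = [], []
    for v in values:
        if v is None:
            left.append(p_left)
            right.append(1.0 - p_left)
        elif v <= threshold:
            left.append(1.0)
            right.append(0.0)
        else:
            left.append(0.0)
            right.append(1.0)
    return left, right


def weighted_mean(y: List[float], w: List[float]) -> float:
    """Weighted mean of the responses y with node weights w."""
    total = sum(w)
    if total == 0:
        raise ValueError("node has zero total weight")
    return sum(yi * wi for yi, wi in zip(y, w)) / total


def shrink_toward_parent(child: float, parent: float, shrinkage: float) -> float:
    """Convex combination of child and parent estimates.

    Assumption: shrinkage is a number in [0, 1]; 0 keeps the child
    estimate unchanged and 1 replaces it with the parent estimate.
    """
    return shrinkage * parent + (1.0 - shrinkage) * child


# Hypothetical toy data: the third observation has a missing split value.
x = [1.0, 2.0, None, 4.0]
y = [10.0, 20.0, 30.0, 40.0]

left_w, right_w = fractional_split(x, threshold=2.5)
parent_mean = sum(y) / len(y)
left_mean = weighted_mean(y, left_w)
right_mean = weighted_mean(y, right_w)
left_shrunk = shrink_toward_parent(left_mean, parent_mean, shrinkage=0.5)
right_shrunk = shrink_toward_parent(right_mean, parent_mean, shrinkage=0.5)
```

In this toy example the observation with the missing value contributes to both child means, and the shrunken estimates lie between each child mean and the parent mean.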
|
Authors who are presenting talks have a * after their name.
|