
Legend: Palais des congrès de Montréal = CC, Le Westin Montréal = W, Intercontinental Montréal = I
A * preceding a session name means that the session is an applied session.
A ! preceding a session name means that the session reflects the JSM meeting theme.

Activity Details


CE_32T Wed, 8/7/2013, 10:00 AM - 11:45 AM W-St. Antoine
Data Mining with TreeNet (Stochastic Gradient Boosting) and Random Forests, Including the Latest Refinements and Model Compression Techniques — Continuing Education CTW
ASA, Salford Systems
Instructor(s): Mikhail Golovnya, Salford Systems
This workshop discusses key algorithmic details of Breiman's Random Forests (RF) and Friedman's TreeNet, along with important extensions to bagging and boosting technology. RF and TreeNet are advances on classification and regression trees that enable the modeler to construct predictive models of extraordinary accuracy. RF is a tree-based procedure that uses bootstrapping and random feature selection. In TreeNet, classification and regression models are built gradually from a large collection of small trees, each of which improves on its predecessors through an error-correcting strategy. Recent developments include the model compression techniques ISLE and RuleLearner, and gradient-boosting/RF crossovers (gradient boosting that incorporates core RF ideas). ISLE is a model compression technology that simplifies and speeds up complex tree-based ensembles. RuleLearner reinterprets TreeNet and/or RF tree ensembles by extracting individual segments described by interesting rules. The rules can be combined to yield compressed models that are often more accurate than the original ensembles. RuleLearner supports individual-specific and group-specific variable importance rankings and offers dependency plots for model interpretation. We will show how the software is used to solve real-world problems, cover the theory, discuss what is novel, illustrate how to choose a balance between model complexity and predictive accuracy, and show where the software fits among other data mining software.
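The error-correcting strategy described in the abstract can be illustrated with a minimal sketch of gradient boosting for squared-error regression. This is not Salford Systems' TreeNet implementation. It is a generic toy version that assumes a single numeric feature, uses one-split trees (stumps) as the "small trees", and omits the row subsampling that makes the method stochastic. All function names and the default settings (`n_trees`, `learning_rate`) are hypothetical choices made for illustration.

```python
def fit_stump(xs, residuals):
    """Fit a one-split tree: pick the threshold that minimizes squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    """Return the leaf mean for the side of the split that x falls on."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Build the ensemble gradually; each stump is fit to the current residuals."""
    base = sum(ys) / len(ys)          # start from the mean response
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # For squared error, the residuals are the negative gradient of the loss.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Take a small step toward the stump's correction.
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the shrunken contributions of all stumps."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    """Mean squared error of the model on the given data."""
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
```

On a toy data set such as `xs = [0.0, 1.0, ..., 9.0]` with `ys = [x * x for x in xs]`, the training error of `boost(xs, ys, n_trees=50)` is lower than that of `boost(xs, ys, n_trees=1)`, which shows how each added tree corrects part of what its predecessors left unexplained.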




For information, contact jsm@amstat.org or phone (888) 231-3473.

If you have questions about the Continuing Education program, please contact the Education Department.

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

ASA Meetings Department  •  732 North Washington Street, Alexandria, VA 22314  •  (703) 684-1221  •  meetings@amstat.org
Copyright © American Statistical Association.