Abstract Details

Activity Number: 444
Type: Contributed
Date/Time: Tuesday, August 2, 2016 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Graphics
Abstract #320977
Title: Visualizing the Classification Model Produced by a Projection Pursuit Random Forest
Author(s): Natalia Da Silva* and Dianne Cook
Companies: Iowa State University and Monash University
Keywords: PPforest; visualization; classification
Abstract:

This paper introduces new tools in the R package PPforest for visualizing projection pursuit classification random forest objects. PPforest is an ensemble learning method that adapts the classic random forest to use combinations of variables, as produced by projection pursuit, in the tree construction. Using linear combinations of variables to separate classes takes the correlation between variables into account, and can outperform the classic forest when separations between groups exist in combinations of variables. PPforest is available at https://github.com/natydasilva/PPforest. Visualization helps in understanding the class structure in the data and how the model fits it. Because a PPforest is composed of many trees fit to subsets of the data, many statistics are calculated, and these essentially form a separate data set. Some of the diagnostics of interest are the same as in the classic forest but are calculated differently: variable importance, out-of-bag (OOB) error rate, vote matrix and proximity matrix. Static, dynamic and interactive plots will be used with these data, linked to the training data, to better understand the fitted PPforest model.
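
As a rough illustration of two of the diagnostics named above, the following minimal sketch computes an OOB vote matrix and an OOB error rate for a generic bagged ensemble. It is not the PPforest implementation or its API: the function names, the input layout (`tree_preds`, `in_bag`) and the toy data are all assumptions made for this example, and the abstract notes that PPforest calculates these quantities differently from the classic forest.

```python
from collections import Counter


def oob_vote_matrix(tree_preds, in_bag, classes):
    """Per-observation proportions of out-of-bag votes for each class.

    tree_preds[t][i]: class predicted by tree t for observation i.
    in_bag[t][i]: True if observation i was in tree t's training sample.
    (Hypothetical input layout, chosen for this sketch.)
    """
    n_trees = len(tree_preds)
    n_obs = len(tree_preds[0])
    votes = []
    for i in range(n_obs):
        # Only trees that did not train on observation i may vote on it.
        counts = Counter(
            tree_preds[t][i] for t in range(n_trees) if not in_bag[t][i]
        )
        total = sum(counts.values())
        # None marks an observation that was in-bag for every tree.
        votes.append([counts[c] / total for c in classes] if total else None)
    return votes


def oob_error_rate(votes, y, classes):
    """Share of observations whose majority OOB vote differs from the label."""
    scored = [(row, label) for row, label in zip(votes, y) if row is not None]
    wrong = sum(classes[row.index(max(row))] != label for row, label in scored)
    return wrong / len(scored)


# Toy data (made up): 3 trees, 4 observations, 2 classes.
classes = ["a", "b"]
y = ["a", "a", "b", "b"]
tree_preds = [
    ["a", "a", "b", "a"],
    ["a", "b", "b", "a"],
    ["a", "a", "b", "a"],
]
in_bag = [
    [True, False, False, True],
    [False, True, False, False],
    [False, False, True, False],
]

votes = oob_vote_matrix(tree_preds, in_bag, classes)
error = oob_error_rate(votes, y, classes)
```

Each row of `votes` is the kind of record a vote-matrix plot would draw on, and linking rows back to the training observations by index is what allows the linked plots described in the abstract.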


Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

 
 
Copyright © American Statistical Association