Online Program

Friday, May 18
Data Visualization
Dynamic Data Visualization
Fri, May 18, 5:15 PM - 6:15 PM
Grand Ballroom F
 

Exploratory Data Analysis for Predictive Analytics (304647)

*Mia Stephens, JMP/SAS 
Ruth M Hummel, JMP/SAS 

Keywords: Exploratory data analysis, analytics, classification trees, logistic regression, decision tree, boosted trees

Decision trees are popular in predictive modeling for several reasons: they handle large problems (many observations, many predictors), work with categorical predictors (including those with many levels), tolerate messy or unruly data, are easily interpretable, and serve both predictive and exploratory purposes.

In this talk, we use a case study to illustrate an intuitive approach to introducing classification trees. We start by exploring variables one at a time, then introduce interactive tools for exploring potential bivariate and multivariate relationships. We fit a logistic regression model and use the JMP Prediction Profiler to understand model effects and explore interactions. We then use the insights gained through exploratory analysis and logistic regression to motivate the introduction of tree-based methods. We show how to build and interact with decision trees and how to interpret decision tree models, and we discuss how to extend tree-based models using bootstrapping (bootstrap forest) and boosting (boosted trees).
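
For readers unfamiliar with how a classification tree chooses a split, the following is a minimal Python sketch of a single split search on one numeric predictor. It is illustrative only: it does not describe JMP's implementation, Gini impurity is just one common splitting criterion, and the data and the names `gini`, `best_split`, `ages`, and `churned` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Find the threshold for the rule 'value <= threshold' that minimizes
    the size-weighted Gini impurity of the two resulting child nodes.

    Returns (threshold, weighted_impurity); threshold is None if no split
    improves on the impurity of the unsplit node.
    """
    n = len(values)
    best_threshold, best_impurity = None, gini(labels)
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Hypothetical toy data: one numeric predictor and a two-level response.
ages = [22, 25, 31, 38, 45, 52, 58, 63]
churned = ["yes", "yes", "yes", "no", "no", "no", "no", "no"]

threshold, impurity = best_split(ages, churned)
print(f"split at age <= {threshold}, weighted Gini = {impurity}")
```

A full tree applies this search recursively to each child node across all predictors. Bootstrap forests and boosted trees, as mentioned above, combine many such trees rather than relying on a single one.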