Online Program

Friday, May 18
Data Visualization
Data Visualization Platforms
Fri, May 18, 1:30 PM - 3:00 PM
Grand Ballroom F
 

The Interactive Solution Path in JMP Pro: A Powerful Tool for Visualizing and Exploring Model Diagnostics (304715)

*Chris Gotwalt, JMP 

Keywords: Variable Selection, Model Diagnostics, Interactive Visualization, Lasso, Forward Selection, Generalized Linear Models

Variable selection, the process of deciding which variables or terms to include in a model, is a critical component of many data analyses. The choice of variables affects the accuracy of model predictions in several ways. A model with too few variables underfits the data and fails to use all the available information. Keeping too many variables, on the other hand, leads to models that generalize poorly to new data, because multicollinearity among the input variables drastically increases the variance of the predictions. Over the years, many diagnostic tools for model fitting have been developed, leading to the common practice of iteratively fitting a sequence of models with software: fit one model, examine some diagnostics, add or remove terms, and then fit the new model. More recently, automated procedures that accelerate this process, such as the Lasso and Forward Selection, have become popular. However, weighing the tradeoffs between candidate models is still a manual process. The Generalized Regression platform in JMP Pro, with its Interactive Solution Path, makes this comparison quick and easy. It plots the regression coefficients and model selection criteria as they evolve over the steps of a variable selection algorithm. The plots show when terms enter and leave the model, as well as the strength and direction of each term's relationship with the response variable. Sliders on the plots select the model that is used in all the other diagnostic plots, which makes it simple to see how adding and removing terms affects residual plots, actual-by-predicted plots, and the variance of the predictions. We find this interactive approach compelling: it streamlines the modeling process and, importantly, makes modeling much easier to teach to students.
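
For readers who want to see what a solution path contains, the following is a minimal sketch in plain Python. It is not JMP Pro code and makes no claim about how the Generalized Regression platform is implemented. It assumes a lasso fitted by coordinate descent over a decreasing grid of penalty values, and it uses a small, hypothetical, noise-free data set with orthogonal columns so that the path is easy to follow by hand. The function names (toy_data, lasso_cd, solution_path, entry_order) are illustrative only.

```python
def toy_data():
    """Hypothetical 2^3 factorial design: centered, mutually orthogonal columns.
    The response depends on columns 0, 1, 2; column 3 (a*b) is irrelevant."""
    rows = [(a, b, c) for a in (-1.0, 1.0) for b in (-1.0, 1.0) for c in (-1.0, 1.0)]
    X = [[a, b, c, a * b] for a, b, c in rows]
    y = [3.0 * a - 2.0 * b + 0.5 * c for a, b, c in rows]
    return X, y


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, beta=None, n_iter=50):
    """Coordinate descent for (1/2n) * ||y - X b||^2 + lam * ||b||_1.
    Assumes X columns and y are centered, so no intercept is fitted."""
    n, p = len(X), len(X[0])
    beta = list(beta) if beta is not None else [0.0] * p
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes term j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_ss[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def solution_path(X, y, n_lambdas=20):
    """Fit the lasso on a log-spaced penalty grid, from the smallest penalty
    that zeroes every coefficient down to 1% of it, warm-starting each fit."""
    n, p = len(X), len(X[0])
    lam_max = max(abs(sum(X[i][j] * y[i] for i in range(n)) / n) for j in range(p))
    lambdas = [lam_max * 0.01 ** (k / (n_lambdas - 1)) for k in range(n_lambdas)]
    path, beta = [], [0.0] * p
    for lam in lambdas:
        beta = lasso_cd(X, y, lam, beta)
        path.append((lam, list(beta)))
    return path


def entry_order(path):
    """Indices of the terms in the order they first become nonzero."""
    order = []
    for _, beta in path:
        for j, b in enumerate(beta):
            if b != 0.0 and j not in order:
                order.append(j)
    return order


if __name__ == "__main__":
    X, y = toy_data()
    path = solution_path(X, y)
    for lam, beta in path:
        print(f"lambda={lam:7.4f}  " + "  ".join(f"{b:7.3f}" for b in beta))
    print("entry order:", entry_order(path))
```

Each row of the printed path corresponds to one position of a slider: moving toward smaller penalties admits more terms, and the sign and size of each coefficient show the direction and strength of its relationship with the response. A fuller version would also record a selection criterion such as AICc or a validation error at each step, which is the kind of quantity the abstract describes plotting alongside the coefficients.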