Abstract:
|
A shortcoming of most black-box supervised learning models is their lack of interpretability or transparency. Partial dependence (PD) plots are the most popular general approach for visualizing the effects of the predictor variables, but they can produce incorrect results when predictors are strongly correlated, because they require extrapolation beyond the training data envelope. Functional ANOVA for correlated inputs can avoid this extrapolation, but it involves prohibitive computational expense and a subjective choice of additive surrogate model to fit to the supervised learning model. We present a new visualization approach, which we term accumulated local effects (ALE) plots, that has several advantages over existing methods. First, ALE plots do not require unreliable extrapolation with correlated predictors. Second, they are orders of magnitude less computationally expensive than PD plots, and many orders of magnitude less expensive than functional ANOVA. Third, they yield convenient variable importance/sensitivity measures that possess a number of desirable properties for quantifying the impact of each predictor.
|
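The contrast between PD and ALE drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the quantile-based binning, the centering step, the function names `partial_dependence` and `ale_first_order`, and the toy model are all assumptions made for illustration. Rows are assumed to be lists of floats, and the chosen feature is assumed to take at least two distinct values.

```python
import random
from bisect import bisect_left

def partial_dependence(predict, rows, j, grid):
    """PD estimate: set feature j to each grid value in EVERY row, then average.
    With correlated features this builds rows far outside the data envelope."""
    curve = []
    for v in grid:
        modified = [r[:j] + [v] + r[j + 1:] for r in rows]
        curve.append(sum(predict(r) for r in modified) / len(modified))
    return curve

def ale_first_order(predict, rows, j, n_bins=10):
    """First-order ALE estimate for feature j on quantile-based bins (assumed scheme)."""
    xs = sorted(r[j] for r in rows)
    # Bin edges at empirical quantiles, de-duplicated and sorted.
    edges = sorted({xs[round(q * (len(xs) - 1) / n_bins)] for q in range(n_bins + 1)})
    n_int = len(edges) - 1
    sums, counts = [0.0] * n_int, [0] * n_int
    for r in rows:
        # Interval k covers (edges[k], edges[k+1]]; the minimum value goes to k = 0.
        k = min(max(bisect_left(edges, r[j]) - 1, 0), n_int - 1)
        lo = r[:j] + [edges[k]] + r[j + 1:]
        hi = r[:j] + [edges[k + 1]] + r[j + 1:]
        # Local effect: move feature j across its own bin only, other features unchanged.
        sums[k] += predict(hi) - predict(lo)
        counts[k] += 1
    # Accumulate the averaged local effects across bins.
    ale = [0.0]
    for s, c in zip(sums, counts):
        ale.append(ale[-1] + (s / c if c else 0.0))
    # Center so the data-weighted average effect is approximately zero.
    mean = sum(0.5 * (ale[k] + ale[k + 1]) * counts[k] for k in range(n_int)) / len(rows)
    return edges, [a - mean for a in ale]

# Toy data: x2 tracks x1 closely, so the two predictors are strongly correlated.
random.seed(0)
rows = []
for _ in range(500):
    x1 = random.uniform(0, 1)
    rows.append([x1, x1 + random.gauss(0, 0.05)])

# Hypothetical model: well behaved where x2 is close to x1, extreme far from it.
model = lambda r: r[0] + 100 * (r[0] - r[1]) ** 2

pd_curve = partial_dependence(model, rows, 0, [0.0, 0.5, 1.0])
edges, ale = ale_first_order(model, rows, 0)
```

In this sketch the PD estimate evaluates the model at combinations such as x1 = 0 paired with x2 near 1, which never occur in the data, whereas the ALE estimate only moves x1 across the bin that each observation already falls in. The ALE loop also calls the model twice per row in total, while the PD loop calls it once per row for every grid value.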