Abstract:
|
To explain what a black-box algorithm does, we can start by studying which variables are important for its decisions. Variable importance is studied by making hypothetical changes to predictor variables. Changing variables one at a time can produce input combinations that are outliers or very unlikely; they can be physically impossible, or even logically impossible. It is problematic to base an explanation on outputs corresponding to impossible inputs. We introduced the cohort Shapley (CS) measure to avoid this problem, based on the Shapley value from cooperative game theory. There are many tradeoffs in picking a variable importance measure, so CS is not the unique reasonable choice. One interesting property of CS is that it can detect 'redlining', meaning the impact of a protected variable on an algorithm's output when that algorithm was trained without the protected variable.
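To make the idea concrete, here is a minimal sketch of exact cohort Shapley for a single target subject, assuming numeric predictors and a hand-picked per-variable similarity tolerance; the function and variable names are illustrative, not from the talk or any published implementation. For each subset S of variables, the value is the mean prediction over the cohort of observed subjects similar to the target on every variable in S, so no hypothetical inputs are ever constructed.

```python
# A minimal sketch, assuming numeric predictors, a per-variable
# similarity tolerance `delta`, and exact enumeration of all 2^d
# variable subsets (feasible only for small d). Names are illustrative.
from itertools import combinations
from math import factorial

import numpy as np

def cohort_shapley(X, y, t, delta):
    """Exact cohort Shapley values for one target subject.

    X     : (n, d) array of observed predictor values
    y     : (n,) array of observed predictions f(x_i)
    t     : row index of the target subject
    delta : (d,) similarity tolerances, one per variable
    """
    n, d = X.shape
    # similar[i, j] holds when subject i matches the target on variable j.
    similar = np.abs(X - X[t]) <= delta

    def v(S):
        # Value of coalition S: mean prediction over the cohort of
        # subjects similar to the target on every variable in S.
        # Only observed rows are averaged, so impossible input
        # combinations never arise.
        mask = similar[:, list(S)].all(axis=1) if S else np.ones(n, dtype=bool)
        return y[mask].mean()

    phi = np.zeros(d)
    for j in range(d):
        rest = [k for k in range(d) if k != j]
        for size in range(d):
            for S in combinations(rest, size):
                w = factorial(size) * factorial(d - size - 1) / factorial(d)
                phi[j] += w * (v(set(S) | {j}) - v(set(S)))
    # By the Shapley efficiency axiom, phi.sum() equals
    # v(all variables) - v(empty set).
    return phi
```

Because cohorts are formed from observed subjects, a variable the model never used can still receive a nonzero value when it is correlated with variables the model did use; that is how CS can surface redlining.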
This talk is based on recent joint work with Masayoshi Mase, Ben Seiler, and Chris Hoyt. The opinions expressed are my own, and not those of Stanford, the National Science Foundation, or Hitachi, Ltd.
|