Abstract Details

Activity Number: 577 - Statistical Methods for Interpreting Machine Learning Algorithms - with Implications for Targeting
Type: Topic Contributed
Date/Time: Wednesday, August 1, 2018 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #329806 Presentation
Title: Beyond Feature Attribution: Quantitative Concept-Based Interpretability with TCAV
Author(s): Been Kim*
Companies: Google Brain
Keywords: interpretable machine learning; explanations

Neural networks commonly offer high utility but remain difficult to interpret. Developing methods to explain their decisions is challenging due to their large size, complex structure, and inscrutable internal representations. This work argues that the language of explanations should be expanded from that of input features (e.g., assigning importance weightings to pixels) to include that of higher-level, human-friendly concepts. For example, an understandable explanation of why an image classifier outputs the label "zebra" would ideally relate to concepts such as "stripes" rather than a set of particular pixel values. This paper introduces the "concept activation vector" (CAV), which allows quantitative analysis of a concept's relative importance to classification, with a user-provided set of input data examples defining the concept. CAVs may be easily used by non-experts, who need only provide examples, and with CAVs the high-dimensional structure of neural networks turns into an aid to interpretation, rather than an obstacle. We show results in two widely used image prediction networks as well as applications in a medical domain (diabetic retinopathy).
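
The abstract does not spell out how a CAV is computed. The sketch below is a minimal illustration, not the authors' implementation. It assumes, following the published TCAV method, that a CAV is the unit normal of a linear classifier separating layer activations of concept examples from those of random examples, and that a concept's importance to a class is summarized as the fraction of class inputs whose logit increases along that direction. The activations are synthetic, and the names `fit_cav`, `tcav_score`, `stripes_axis`, and the one-unit toy logit are hypothetical stand-ins for a real network. The statistical testing against random concepts used in the published method is omitted.

```python
import numpy as np


def fit_cav(concept_acts, random_acts, steps=500, lr=0.1):
    """Fit a logistic-regression separator and return its unit normal.

    concept_acts / random_acts: (n, d) arrays of layer activations for
    the user-provided concept examples and for random counterexamples.
    """
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(concept)
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient step on log loss
        b -= lr * np.mean(p - y)
    return w / np.linalg.norm(w)                # the CAV (unit length)


def tcav_score(logit_grads, cav):
    """Fraction of inputs whose class logit increases along the CAV."""
    return float(np.mean(logit_grads @ cav > 0))


# Synthetic stand-in for a real layer: the "stripes" concept shifts
# activations along axis 0 (an assumption made only for this demo).
rng = np.random.default_rng(0)
dim = 8
stripes_axis = np.zeros(dim)
stripes_axis[0] = 1.0
concept_acts = rng.normal(size=(100, dim)) + 3.0 * stripes_axis
random_acts = rng.normal(size=(100, dim))
cav = fit_cav(concept_acts, random_acts)

# Toy "zebra" logit h(a) = tanh(a[0]); its gradient w.r.t. the activations
# is (1 - tanh(a[0])**2) along axis 0 and zero elsewhere.
class_acts = rng.normal(size=(50, dim))
grads = (1.0 - np.tanh(class_acts[:, :1]) ** 2) * stripes_axis
score = tcav_score(grads, cav)
```

With a real network, `concept_acts`, `random_acts`, and `grads` would come from a chosen layer's activations and from the gradient of the class logit with respect to that layer; only the example images defining the concept would need to be supplied by the user.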

Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program