Abstract:
|
Hypothesis testing for the relevance of features to prediction has become crucial to explainable machine learning. In this paper, we derive one-split and two-split tests for the feature relevance of a black-box model such as a deep neural network. The one-split test uses a perturbed empirical loss difference based on a single split of the full sample into two parts, called the estimation and inference samples, used respectively for training the black-box learner and constructing the test statistic. The two-split test further splits the inference sample but does not require perturbation. Moreover, we derive combined tests by aggregating p-values over a limited number of random splits. Furthermore, we develop an adaptive scheme that estimates the splitting ratio and the perturbation level so as to control the Type I error. Our theoretical analysis and simulations indicate that the one-split test is more powerful and that the combined tests can compensate for the power loss. Numerically, we show that the proposed tests effectively reveal the dependence between the hypothesized features and the prediction. All tests are implemented in the proposed Python library dnn-inference (https://dnn-inference.readthedocs.io).
|
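As a rough illustration of the one-split procedure summarized in the abstract, the sketch below implements the core idea under stated assumptions: a squared-error loss, scikit-learn's MLPRegressor standing in for the black-box learner, feature masking by zeroing, and illustrative names such as `one_split_test` and the perturbation level `delta`. It is not the dnn-inference API and omits the adaptive tuning and p-value combination described in the paper.

```python
# Minimal sketch of the one-split test idea (illustrative, not the dnn-inference API).
import numpy as np
from scipy import stats
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

def one_split_test(X, y, feats, split_ratio=0.5, delta=0.1, seed=0):
    """Test H0: the features indexed by `feats` are irrelevant to prediction.

    The sample is split once into an estimation part (to fit two learners,
    with and without the hypothesized features) and an inference part
    (to form the perturbed loss-difference statistic).
    """
    rng = np.random.default_rng(seed)
    X_est, X_inf, y_est, y_inf = train_test_split(
        X, y, test_size=split_ratio, random_state=seed)

    # Mask the hypothesized features by zeroing them out for the reduced model.
    mask = np.ones(X.shape[1], dtype=float)
    mask[feats] = 0.0

    full = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                        random_state=seed).fit(X_est, y_est)
    reduced = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                           random_state=seed).fit(X_est * mask, y_est)

    # Per-sample squared-error loss difference on the inference sample,
    # plus a small Gaussian perturbation to avoid a degenerate null distribution.
    loss_full = (y_inf - full.predict(X_inf)) ** 2
    loss_reduced = (y_inf - reduced.predict(X_inf * mask)) ** 2
    diff = loss_reduced - loss_full + delta * rng.standard_normal(len(y_inf))

    # One-sided z-test: a large positive mean difference indicates relevance.
    z = np.sqrt(len(diff)) * diff.mean() / diff.std(ddof=1)
    return 1.0 - stats.norm.cdf(z)

# Toy usage: only the first two features carry signal, so testing them should
# yield a small p-value, while testing a pure-noise feature should not.
rng = np.random.default_rng(1)
X = rng.standard_normal((600, 5))
y = X[:, 0] + 2 * X[:, 1] + 0.5 * rng.standard_normal(600)
print(one_split_test(X, y, feats=[0, 1]))  # expected: small p-value
print(one_split_test(X, y, feats=[4]))     # expected: non-significant
```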