Abstract:
|
Understanding heterogeneous treatment effects lies at the heart of several modern scientific and engineering challenges, ranging from personalized medicine to customized marketing offers and recommendations. In this paper, we develop a non-parametric causal forest estimator for such treatment effects as a function of an individual's characteristics; our method is closely inspired by Breiman's widely used random forest predictor. We show that causal forests are pointwise consistent for the true treatment effect, and have an asymptotically Gaussian and centered sampling distribution. Moreover, the asymptotic variance of causal forests can be accurately estimated in practice, meaning that causal forests can be used for valid statistical inference about heterogeneous treatment effects. Our theoretical results rely on a generic asymptotic normality theory for a large family of random forest algorithms. To our knowledge, this is the first set of results that allows random forests to be used for valid statistical inference, even for classification or regression forests in the standard prediction context. In simulations, we find causal forests to be substantially more powerful than k-nearest-neighbor (KNN) estimators.
|