Abstract:
|
Extracting causal information from regression-discontinuity (RD) studies, where the treatment assignment rule depends on some type of cutoff formula, can be challenging, especially in the presence of big data. Following Li, Mattei and Mealli (2015), we formally describe RD designs as local randomized experiments within the potential outcome approach. Under this framework, causal inference concerns units belonging to some subpopulation in which a local overlap assumption, the stable unit treatment value assumption (SUTVA) and a local randomization assumption (together, the RD assumptions) hold. Unfortunately, we usually do not know the subpopulations for which we can draw valid causal inferences. We propose a model-based finite mixture approach to clustering, within a Bayesian framework, to classify observations into subpopulations for which we can draw valid causal inferences and subpopulations from which we can extract no causal information on the basis of the observed data and the RD assumptions. This approach has important advantages: it explicitly accounts for the uncertainty about subpopulation membership; it does not impose any constraint on the shape of the subpopulation; and it works properly in high-dimensional settings.
|