Abstract:
|
Subgroups with heterogeneous treatment effects are often learned in experimental settings. Drawing reliable inferences about such effects from experimental data is problematic, however, when the sample is too small to separate genuine heterogeneity from noise. In this study, we adopt a two-stage approach to propose and then test heterogeneous treatment effects. In Stage 1, we use a large observational dataset to learn the subgroups with the most distinctive treatment-outcome relationships ("high/low-impact subgroups"). We propose these subgroups using model-based recursive partitioning and validate them with a sample-splitting approach, so that the subgroups we retain are "noise-free". While the first stage rules out noise, the subgroup estimates may still be biased because they come from observational data. In Stage 2, we use an experimental design and classify the sample units according to the subgroups learned in Stage 1. We then estimate unbiased treatment effects within each subgroup using a difference-in-differences approach, thereby testing the causal hypotheses proposed in Stage 1. We also extend our approach to non-parametric estimation of the subgroups.
|
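As a rough illustration of the Stage 2 idea, the sketch below computes a simple difference-in-differences estimate separately within each subgroup. It is a minimal sketch under stated assumptions, not the estimator used in the study: the function name `did_by_subgroup`, the two-period pre/post layout, and the subgroup labels are all hypothetical, and the subgroup assignments are assumed to have been produced beforehand by the Stage 1 partitioning.

```python
import numpy as np


def did_by_subgroup(y_pre, y_post, treated, subgroup):
    """Two-period difference-in-differences within each subgroup.

    Hypothetical helper for illustration only. For each subgroup label it
    returns (mean outcome change among treated units) minus
    (mean outcome change among control units).
    """
    y_pre = np.asarray(y_pre, dtype=float)
    y_post = np.asarray(y_post, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    subgroup = np.asarray(subgroup)

    # First difference: each unit's change from the pre to the post period.
    change = y_post - y_pre

    estimates = {}
    for label in np.unique(subgroup):
        in_group = subgroup == label
        treated_change = change[in_group & treated]
        control_change = change[in_group & ~treated]
        # Skip subgroups that lack either treated or control units.
        if treated_change.size == 0 or control_change.size == 0:
            continue
        # Second difference: treated change minus control change.
        estimates[label.item()] = treated_change.mean() - control_change.mean()
    return estimates


if __name__ == "__main__":
    # Toy data (made up for illustration): four units per subgroup.
    est = did_by_subgroup(
        y_pre=[1, 1, 1, 1, 2, 2, 2, 2],
        y_post=[4, 4, 2, 2, 3.5, 3.5, 3, 3],
        treated=[1, 1, 0, 0, 1, 1, 0, 0],
        subgroup=["high"] * 4 + ["low"] * 4,
    )
    print(est)
```

Comparing the per-subgroup estimates (here, the hypothetical "high" and "low" labels) is one simple way to check whether the subgroups proposed from observational data also differ in the experimental sample; a full analysis would additionally need standard errors.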