Abstract:
|
Difference-in-differences (DD) is a popular econometric technique for identifying and quantifying causal effects in observational data. We introduce GP-DD, a generalization of DD that uses Gaussian processes as a nonparametric prior over smooth functions. We estimate the causal effect by maximizing the marginal likelihood and obtain confidence intervals with a bootstrapping procedure. Using synthetic data, we demonstrate that for fixed-effects data with i.i.d. noise, GP-DD matches the accuracy of traditional DD. With highly correlated noise, however, GP-DD is a more efficient estimator than traditional DD. This indicates that GP-DD is particularly well suited to non-i.i.d. domains such as spatio-temporal data. Additionally, we develop scalable inference for GP-DD models using Kronecker methods, which significantly reduce the naive O(N^3) computational cost of Gaussian process optimization and allow GP-DD to scale to massive datasets.
|
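For orientation, the traditional DD baseline that the abstract compares against can be sketched in its simplest two-group, two-period form: the estimate is the change in the treated group's mean outcome minus the change in the control group's mean outcome. This is a minimal illustration of classic DD only, not of GP-DD; the function name `did_estimate` and the toy numbers are hypothetical and do not come from the paper.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Classic 2x2 difference-in-differences estimate.

    Each argument is a sequence of outcomes for one group in one period.
    Hypothetical helper for illustration; not the paper's GP-DD estimator.
    """
    # Change over time within each group.
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # The control group's change stands in for the trend the treated group
    # would have followed without treatment (parallel-trends assumption).
    return treated_change - control_change


# Toy outcomes, invented for illustration only.
treated_pre = [10.0, 12.0, 11.0]
treated_post = [15.0, 17.0, 16.0]
control_pre = [8.0, 9.0, 10.0]
control_post = [10.0, 11.0, 12.0]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(effect)
```

In this toy data the treated group's mean rises by 5 and the control group's by 2, so the estimate attributes the remaining 3 to the treatment.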