Online Program

Saturday, May 19
Data Science
Time-based Models
Sat, May 19, 8:30 AM - 10:00 AM
Lake Fairfax B
 

Causal Inference from Observational Time Series Data (304637)

Iavor Bojinov, Harvard 
Min Liu, LinkedIn 
*Iris Tu, LinkedIn 
Ya Xu, LinkedIn 

Keywords: Causal Inference, Social Networks, Time Series, Longitudinal Study

Randomized experiments (A/B tests) have become the standard way for web-facing companies to evaluate new products, guide product development, and prioritize ideas. There are times, however, when running an experiment is too complicated, costly, or time-consuming; in such cases, observational data can be used to quickly and cheaply obtain a reasonable estimate of the causal effect. Social networks, such as Facebook and LinkedIn, often employ strategies that encourage users to contribute (e.g., post, comment, like, and send private messages) because of the belief that this will increase a user's visitation frequency, which in turn will maximize revenue. Measuring the effect of contributions on a user's propensity to return to the site, and the impact on her neighborhood (i.e., the users she is connected to), can be done using experiments, but this requires an encouragement design (since we cannot directly force users to contribute) and the development of non-standard experimentation infrastructure. In this setting, however, we have an abundance of historical data measuring a user's contributions and her visitation pattern, which can be used to estimate the causal effect. In this paper, we present an approach based on linear fixed effects models to estimate the contemporaneous (or instantaneous) causal effect of a user's contribution on her own and her neighbors' subsequent visits. By using temporal data and a carefully designed observational study, we can reduce the self-selection bias and obtain more accurate measures of the causal effect. Using LinkedIn data for several million members, we further show that temporal data allows us to remove more of the bias than state-of-the-art single-snapshot methods such as propensity score stratification, inverse propensity score weighting, and doubly robust methods.
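To make the fixed effects idea concrete, here is a minimal sketch in Python on simulated data. It is not the authors' model or data: the data-generating process, the function names (`within_estimator`, `simulate`), and all parameter values are assumptions made for illustration, and the sketch covers only a user's own outcome, not the effect on neighbors. It assumes a simple model y_it = alpha_i + beta * x_it + e_it, where alpha_i is an unobserved user-level engagement term that also drives self-selection into contributing.

```python
import numpy as np


def within_estimator(user_ids, x, y):
    """Estimate beta in y_it = alpha_i + beta * x_it + e_it.

    Demeaning x and y within each user removes the time-invariant
    user effect alpha_i (the "within" transformation).
    """
    user_ids = np.asarray(user_ids)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Map arbitrary user ids to contiguous indices 0..n_users-1.
    _, idx = np.unique(user_ids, return_inverse=True)
    counts = np.bincount(idx)
    x_mean = np.bincount(idx, weights=x) / counts
    y_mean = np.bincount(idx, weights=y) / counts
    x_w = x - x_mean[idx]
    y_w = y - y_mean[idx]
    # OLS slope on the demeaned data (no intercept needed).
    return float(x_w @ y_w / (x_w @ x_w))


def simulate(n_users=2000, n_periods=8, beta=0.5, seed=0):
    """Hypothetical panel: engaged users both contribute and visit more."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n_users)  # unobserved user-level engagement
    user_ids = np.repeat(np.arange(n_users), n_periods)
    a = alpha[user_ids]
    # Self-selection: the chance of contributing rises with engagement.
    x = (rng.normal(size=a.size) + a > 0).astype(float)
    # Visits depend on engagement, on contributing, and on noise.
    y = a + beta * x + rng.normal(size=a.size)
    return user_ids, x, y


user_ids, x, y = simulate()

# Naive pooled regression slope, ignoring user effects.
naive = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
# Fixed effects (within) estimate using the repeated observations.
fe = within_estimator(user_ids, x, y)

print(f"true beta = 0.5, naive = {naive:.3f}, fixed effects = {fe:.3f}")
```

In this simulation the pooled estimate absorbs the engagement term and overstates the effect, while the within estimate uses only variation over time inside each user. Real data would also call for time effects and time-varying confounders, which this sketch omits.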