Online Program

Tuesday, January 7
Tue, Jan 7, 2:00 PM - 3:45 PM
Porthole
Novel Methods in Causal Inference

Implementation of a novel machine learning ensemble to identify risk-adjusted facility level behavior in the presence of longitudinal imbalanced data for use in instrumental variable analysis. (307895)

*Evan Paul Carey, Assistant Professor, Health Data Science, St Louis University 
Gary Grunwald, Colorado School of Public Health 

Keywords: instrumental variable analysis, ensemble, machine learning, causal inference

Facility prescribing preference has been used as an instrument in instrumental variable analyses (IVA) to assess causal relationships between medication exposure and outcomes. This requires identifying high- and low-prescribing clusters. Studies have often defined high and low prescribing behavior with simple summaries of prior prescribing. However, observational data are typically imbalanced across clusters, as are patient factors. We propose a novel two-stage machine learning ensemble to identify facility-level prescribing behavior. The first stage estimates patient exposure probabilities conditional on patient factors. The second stage implements cluster-level shrinkage through an elastic net regularized regression on cluster-level observed-to-expected ratios. Ensemble hyper-parameters include alpha and lambda, which control the shape and strength of cluster shrinkage, and the number of prior time points included for each cluster. We test this method in a cohort of 2,229,166 US Veterans reporting chronic pain between 2008 and 2015, and contrast the resulting instrument accuracy with traditional approaches to IV identification.
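
The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it (stage_one_expected, log_oe_by_facility, elastic_net_shrink, the toy records) is hypothetical. Stage one is replaced by a simple stratum-level exposure rate as a stand-in for whatever learner estimates exposure probabilities. Stage two assumes one indicator per facility, no intercept, and patient-count weights, in which case the elastic net solution has a closed form (soft-thresholding followed by ridge scaling). The number of prior time points is omitted.

```python
import math
from collections import defaultdict


def stage_one_expected(records):
    """Stand-in for stage one: exposure probability = exposure rate in the patient's stratum.

    records is a list of (facility, stratum, exposed) with exposed in {0, 1}.
    """
    total, exposed_count = defaultdict(int), defaultdict(int)
    for _, stratum, exposed in records:
        total[stratum] += 1
        exposed_count[stratum] += exposed
    return [exposed_count[s] / total[s] for _, s, _ in records]


def log_oe_by_facility(records, probs):
    """Log observed-to-expected exposure ratio per facility, plus patient counts."""
    observed, expected, counts = defaultdict(float), defaultdict(float), defaultdict(int)
    for (facility, _, exposed), p in zip(records, probs):
        observed[facility] += exposed
        expected[facility] += p
        counts[facility] += 1
    # Assumption: add 0.5 to both sums so that log(0) cannot occur.
    log_oe = {f: math.log((observed[f] + 0.5) / (expected[f] + 0.5)) for f in observed}
    return log_oe, dict(counts)


def soft_threshold(z, t):
    """Shrink z toward zero by t, returning zero when |z| <= t."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def elastic_net_shrink(log_oe, counts, alpha, lam):
    """Closed-form elastic net for one indicator per facility (orthogonal design).

    Minimizes 0.5 * sum_j w_j * (y_j - b_j)^2
              + lam * (alpha * |b_j| + 0.5 * (1 - alpha) * b_j^2),
    where w_j is the facility's share of all patients.
    """
    n = sum(counts.values())
    shrunk = {}
    for facility, y in log_oe.items():
        w = counts[facility] / n
        shrunk[facility] = soft_threshold(w * y, lam * alpha) / (w + lam * (1 - alpha))
    return shrunk


# Toy data (fabricated purely for illustration): (facility, age stratum, exposed)
records = [
    ("A", "young", 1), ("A", "young", 1), ("A", "old", 1), ("A", "old", 0),
    ("B", "young", 0), ("B", "young", 1), ("B", "old", 0), ("B", "old", 0),
    ("C", "young", 1), ("C", "old", 0),
]
probs = stage_one_expected(records)
log_oe, counts = log_oe_by_facility(records, probs)
shrunk = elastic_net_shrink(log_oe, counts, alpha=0.5, lam=0.05)
# A positive shrunken value flags a facility as prescribing above expectation.
high_prescribers = sorted(f for f, b in shrunk.items() if b > 0)
```

In this sketch, a facility whose shrunken log ratio stays positive would be labeled high prescribing and one that stays negative low prescribing. Larger lam pulls small or sparsely observed facilities toward zero, and alpha closer to 1 sets more of them exactly to zero.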