Hospital benchmarking typically uses regression to adjust for differences in patient illness severity, but hospitals with non-overlapping patient populations may not be accurately compared this way. Standard regression diagnostics do not check whether covariate distributions overlap. Regression-based benchmarking also compares a given hospital against a “typical” hospital, which may not reflect that hospital’s own patient case mix. Instead, we compared each hospital’s patient population with a matched cohort of similar patients treated elsewhere. We used the eICU Collaborative Research Database, a freely available multi-center database for critical care research. We included 120 hospitals with at least 300 hospitalizations each (n=138,557). For each hospital, we used 1:5 propensity score matching to find similar patients treated at other hospitals. The propensity score model included age, sex, diagnosis, comorbidities, and laboratory measures known to be associated with mortality. In-hospital mortality at each hospital was compared with that of its matched benchmark using a multilevel model to account for the clustering of the control group. Three hospitals had significantly different mortality than their matched comparison group (2 lower, 1 higher).
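As a rough illustration of the 1:5 matching step, the following sketch fits a propensity score for "treated at the target hospital" and then selects five controls per target-hospital patient. The abstract does not state the matching algorithm, caliper, or software used, so everything here is an assumption for illustration only: the greedy nearest-neighbour matching without replacement, the gradient-descent logistic fit, the two synthetic covariates, and all function names.

```python
import math
import random


def predict_proba(w, x):
    """Logistic model: probability of being treated at the target hospital."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clip to keep exp() numerically safe
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent.

    Returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def match_one_to_k(scores, treated, k=5):
    """Greedy nearest-neighbour matching on the propensity score.

    Each treated patient receives the k unused controls with the closest
    scores; controls are matched without replacement (an assumption).
    """
    controls = {i for i, t in enumerate(treated) if t == 0}
    matches = {}
    for i, t in enumerate(treated):
        if t != 1:
            continue
        nearest = sorted(
            controls, key=lambda j: (abs(scores[j] - scores[i]), j)
        )[:k]
        if len(nearest) < k:
            break  # control pool exhausted
        matches[i] = nearest
        controls.difference_update(nearest)
    return matches


# Synthetic data (hypothetical): two standardised covariates per patient,
# e.g. age and one laboratory value. Roughly 10% of patients belong to the
# target hospital, which skews towards higher values of the first covariate.
rng = random.Random(0)
n = 600
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
treated = [
    1 if rng.random() < 1.0 / (1.0 + math.exp(2.5 - 0.8 * x[0])) else 0
    for x in X
]

weights = fit_logistic(X, treated)
scores = [predict_proba(weights, x) for x in X]
matches = match_one_to_k(scores, treated, k=5)

print(f"target-hospital patients: {sum(treated)}")
print(f"matched sets formed:      {len(matches)}")
```

In practice one would check covariate balance in the matched sample (for example, standardized mean differences) before comparing outcomes. The multilevel mortality model described in the abstract is not shown here.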