Abstract:
|
This paper presents a method for optimal matching in very large observational studies. Because of time and computer memory limits, current matching algorithms can find an optimal matching for at most several thousand treated subjects at a time. In current practice, large problems are broken into many smaller problems that are matched separately, making the overall match suboptimal. We introduce the use of optimal calipers to remove edges from the original dense network and reduce it to a sparse network. The edge set no longer includes all pairs of treated and control subjects; instead, it is restricted to pairs that are close as judged by the caliper. The caliper is optimal in the sense that it is as small as possible such that a matching still exists. In addition, we meet the goal of near-fine balance, which balances nominal covariates as closely as possible without creating infeasibility, and the match minimizes a covariate distance among all matches that respect the caliper and are near-fine. A large matched case-control study with millions of individuals illustrates these features in detail. A new R package called matchscalable implements the method.
|
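To make the notion of an optimal caliper concrete, the following minimal Python sketch finds the smallest caliper on a single score such that every treated subject can still be paired with a distinct control. It is an illustration under stated assumptions, not the paper's algorithm or the package's interface: the function names `max_matching_size` and `smallest_feasible_caliper` are hypothetical, the score is a single number per subject, matching is one-to-one, and feasibility is checked with a simple augmenting-path routine. For clarity the toy version enumerates all treated-control pairs, which is exactly what a method for very large studies would need to avoid.

```python
from typing import List, Optional, Sequence


def max_matching_size(adj: List[List[int]], n_controls: int) -> int:
    """Size of a maximum bipartite matching, found by augmenting paths.

    adj[t] lists the controls that treated unit t may be paired with.
    """
    match_of_control: List[Optional[int]] = [None] * n_controls

    def try_assign(t: int, seen: List[bool]) -> bool:
        # Try to give treated unit t a control, re-assigning others if needed.
        for c in adj[t]:
            if seen[c]:
                continue
            seen[c] = True
            current = match_of_control[c]
            if current is None or try_assign(current, seen):
                match_of_control[c] = t
                return True
        return False

    return sum(try_assign(t, [False] * n_controls) for t in range(len(adj)))


def smallest_feasible_caliper(
    treated: Sequence[float], controls: Sequence[float]
) -> Optional[float]:
    """Smallest w such that all treated units can be paired with distinct
    controls using only pairs with |treated score - control score| <= w.

    Returns None when no pair matching exists (more treated than controls).
    """
    if len(treated) > len(controls):
        return None
    if not treated:
        return 0.0

    def feasible(w: float) -> bool:
        # Keep only edges inside the caliper, then test for a full matching.
        adj = [
            [j for j, c in enumerate(controls) if abs(t - c) <= w]
            for t in treated
        ]
        return max_matching_size(adj, len(controls)) == len(treated)

    # The optimal caliper must equal one of the observed pair distances.
    candidates = sorted({abs(t - c) for t in treated for c in controls})

    # Feasibility is monotone in w, so binary search over the candidates.
    # The largest candidate keeps every edge and is therefore feasible.
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(candidates[mid]):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]
```

For example, with treated scores `[0, 1]` and control scores `[1, 10]`, a caliper of 1 leaves both treated subjects competing for the same control, so the smallest caliper that still admits a complete pair matching is 9. A subsequent step, not shown here, would then minimize a covariate distance over the sparse set of pairs that survive the caliper.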