Abstract:
|
In distribution compression, one aims to accurately summarize a probability distribution $P$ using a small number of representative points. Near-optimal thinning procedures achieve this goal by sampling n points from a Markov chain and identifying $\sqrt{n}$ points with $O(1/\sqrt{n})$ discrepancy. Unfortunately, these algorithms suffer from quadratic or super-quadratic runtime. To address this deficiency, we introduce Compress++, a simple meta-procedure for speeding up any thinning algorithm while suffering at most a factor 4 in error. When combined with the quadratic-time kernel thinning algorithms of Dwivedi and Mackey (2021), Compress++ delivers $\sqrt{n}$ points with $O(\sqrt{ log n /n })$ integration error and better-than-Monte-Carlo maximum mean discrepancy in $O(n log^3 n)$ time. Moreover, Compress++ enjoys the same near-linear runtime given any quadratic-time input and reduces the runtime of super-quadratic algorithms by a square-root factor. In our benchmarks with high-dimensional Monte Carlo samples and Markov chains targeting challenging differential equation posteriors, Compress++ nearly matches the accuracy of its input algorithm in orders of magnitude less time.
|