Abstract:
|
Bayesian inference provides a coherent approach to learning from data in complex models. However, algorithms for performing inference have not yet caught up to the deluge of data in modern applications. One approach---Bayesian coresets---involves replacing the large dataset with a small, weighted subset of data (a coreset) during inference. Although the methodology is sound in principle, efficiently constructing a coreset remains a significant challenge: current methods tend to be complicated to implement, slow, and reliant on a secondary inference step after coreset construction. In this talk, I will introduce a new method---sparse Hamiltonian flows---that addresses all three of these challenges. The method first subsamples the data uniformly, and then optimizes a Hamiltonian flow that is parametrized by the coreset weights and includes periodic momentum quasi-refreshment steps. I will present results showing that the method enables exponential compression of the dataset in representative models, along with experiments demonstrating that sparse Hamiltonian flows provide accurate posterior approximations with significantly reduced runtime compared with competing dynamical-system-based inference methods.
|
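The abstract itself contains no code; the sketch below is only a rough illustration of the recipe it describes (uniform subsampling, then optimizing a Hamiltonian flow over coreset weights with periodic momentum quasi-refreshments), not the authors' implementation. It assumes a synthetic Gaussian location model, a diagonal affine quasi-refreshment, a single-sample unnormalized ELBO, and plain gradient descent; every name and constant in it (`core_log_post`, `flow`, `elbo`, the step sizes) is hypothetical.

```python
import jax
import jax.numpy as jnp

N, M, D = 1000, 20, 2        # full data size, coreset size, dimension
T, K, EPS = 30, 10, 0.05     # leapfrog steps, refreshment period, step size

key = jax.random.PRNGKey(0)
key, k1, k2 = jax.random.split(key, 3)
x_full = 2.0 + jax.random.normal(k1, (N, D))     # synthetic observations
idx = jax.random.choice(k2, N, (M,), replace=False)
x_core = x_full[idx]                             # step 1: uniform subsample

def log_lik(theta, x):
    # per-point Gaussian location log-likelihood, up to constants
    return -0.5 * jnp.sum((x - theta) ** 2, axis=-1)

def log_prior(theta):
    return -0.5 * jnp.sum(theta ** 2)

def core_log_post(theta, w):
    # unnormalized log-density of the weighted coreset posterior
    return jnp.dot(w, log_lik(theta, x_core)) + log_prior(theta)

def flow(params, theta, rho):
    # leapfrog dynamics targeting the coreset posterior, with a diagonal
    # affine momentum quasi-refreshment every K steps; leapfrog is
    # volume-preserving, so only the refreshments contribute a log-Jacobian
    w, log_a, b = params
    grad_logp = jax.grad(core_log_post)
    logdet = 0.0
    for t in range(T):
        rho = rho + 0.5 * EPS * grad_logp(theta, w)
        theta = theta + EPS * rho
        rho = rho + 0.5 * EPS * grad_logp(theta, w)
        if (t + 1) % K == 0:
            rho = jnp.exp(log_a) * rho + b       # quasi-refreshment
            logdet = logdet + jnp.sum(log_a)
    return theta, rho, logdet

def elbo(params, key):
    # single-sample reparameterized ELBO against the full-data posterior,
    # augmented with an independent N(0, I) momentum; constants dropped
    theta0 = jax.random.normal(key, (D,))
    rho0 = jax.random.normal(jax.random.fold_in(key, 1), (D,))
    log_q0 = -0.5 * (jnp.sum(theta0 ** 2) + jnp.sum(rho0 ** 2))
    theta, rho, logdet = flow(params, theta0, rho0)
    log_p = (jnp.sum(log_lik(theta, x_full)) + log_prior(theta)
             - 0.5 * jnp.sum(rho ** 2))
    return log_p - (log_q0 - logdet)

# trainable: coreset weights (init N/M), refreshment scale and shift
params = (jnp.full((M,), N / M), jnp.zeros(D), jnp.zeros(D))
step = jax.jit(jax.grad(lambda p, k: -elbo(p, k)))
for i in range(200):
    key, sub = jax.random.split(key)
    g = step(params, sub)
    params = jax.tree_util.tree_map(lambda p, gi: p - 1e-3 * gi, params, g)
```

A full treatment would additionally constrain the weights to be nonnegative and choose the quasi-refreshments more carefully; this sketch omits those details, minibatching, and tuning.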