Abstract:
|
Gaussian variational inference and the Laplace approximation are popular alternatives to Markov chain Monte Carlo that formulate Bayesian inference as an optimization problem. A key limitation of both methods is that finding the global optimum of the underlying optimization problem is typically intractable; even in simple settings the problem is nonconvex. Thus, recently developed statistical guarantees, which all involve the data-asymptotic properties of the global optimum, are not reliably obtained in practice. In this work, we provide two major contributions: a theoretical analysis of the asymptotic convexity properties of variational inference with a Gaussian family and of the maximum a posteriori (MAP) problem required by the Laplace approximation; and two algorithms, consistent Laplace approximation (CLA) and consistent stochastic variational inference (CSVI), that exploit these properties to find the optimal approximation in the asymptotic regime. Both CLA and CSVI involve a tractable initialization procedure that finds the local basin of the optimum, and CSVI further includes a scaled gradient descent algorithm that provably remains confined to that basin.
|