
All Times EDT

Abstract Details

Activity Number: 151 - Recent Advances in Bayesian Approaches to Neuroimaging
Type: Invited
Date/Time: Tuesday, August 10, 2021 : 10:00 AM to 11:50 AM
Sponsor: WNAR
Abstract #316676
Title: Sketching in Bayesian High Dimensional Regressions with Big Data
Author(s): Rajarshi Guhaniyogi* and Aaron Wolfe Scheffler
Companies: University of California, Santa Cruz and University of California, San Francisco
Keywords: multi-modal image; multi-object regression; mixture models; supervised clustering; primary progressive aphasia; variable selection
Abstract:

Bayesian computation of high-dimensional linear regression models with popular Gaussian scale mixture priors using Markov chain Monte Carlo (MCMC) or its variants can be extremely slow or completely prohibitive, because the computational cost grows cubically in p, the number of features. Although a few recently developed algorithms make the computation efficient in the presence of a small to moderately large sample size (with complexity growing cubically in n), the computation becomes intractable when the sample size n is also large. We adopt a data sketching approach that compresses the n original samples to m samples in p dimensions by a random linear transformation, and compute the Bayesian regression with Gaussian scale mixture priors on the randomly compressed response vector and feature matrix. The proposed approach yields computational complexity growing cubically in m. Another important motivation for this compression procedure is that it anonymizes the data, revealing little information about the original samples in the course of the analysis. A detailed empirical investigation with the Horseshoe prior, a member of the class of Gaussian scale mixture priors, shows that the proposed approach delivers inference closely matching regression on the full sample with a massive reduction in per-iteration computation time. A notable contribution of this article is a posterior contraction rate for the high-dimensional predictor coefficients under a general class of shrinkage priors after data compression/sketching. In particular, we characterize the dimension m of the compressed response vector as a function of the sample size, the number of predictors, and the sparsity of the regression, so as to guarantee asymptotically accurate estimation of the predictor coefficients even after data compression. This is joint work with Aaron Scheffler.
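The compression step described above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a Gaussian sketching matrix for the random linear transformation, and uses a ridge fit on the compressed data as a simple stand-in for the Horseshoe-prior MCMC discussed in the abstract. All dimensions (n, p, m) and the sparse coefficient vector are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: n samples, p features, sketch size m << n.
n, p, m = 5000, 100, 500

# Simulated data with a sparse coefficient vector (hypothetical example).
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = [2.0, -1.5, 1.0, 0.5, -0.5]
y = X @ beta + rng.standard_normal(n)

# Gaussian sketching matrix Phi (m x n); the 1/sqrt(m) scaling makes
# E[Phi.T @ Phi] = I_n, so the compressed Gram matrix X_s.T @ X_s
# approximates the full X.T @ X.
Phi = rng.standard_normal((m, n)) / np.sqrt(m)

# Compress the response vector and feature matrix by one linear map.
y_s = Phi @ y          # length m instead of n
X_s = Phi @ X          # m x p instead of n x p

# Any subsequent regression now works with m rows rather than n;
# a ridge solve here stands in for the Bayesian shrinkage fit.
lam = 1.0
beta_hat = np.linalg.solve(X_s.T @ X_s + lam * np.eye(p), X_s.T @ y_s)
```

Note that the original (y, X) never enter the downstream fit, only the randomly compressed (y_s, X_s), which is the source of the anonymization property mentioned in the abstract.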


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program