Activity Number: 281 - Bayesian Methods for Complex Data Analysis
Type: Topic-Contributed
Date/Time: Wednesday, August 11, 2021 : 1:30 PM to 3:20 PM
Sponsor: International Society for Bayesian Analysis (ISBA)
Abstract #317185
Title: Bayesian Dynamic Feature Partitioning in High-Dimensional Regression with Big Data
Author(s): Rene Gutierrez* and Rajarshi Guhaniyogi
Companies: UC Santa Cruz and University of California, Santa Cruz
Keywords: Bayesian Statistics; Data Shards; High Dimensional Regression; Shrinkage Prior; Sufficient Statistics; Streaming Data
Abstract:
Bayesian computation for high-dimensional linear regression models using MCMC or its variants can be extremely slow or outright prohibitive, since these methods perform costly computations at each iteration of the sampling chain. Furthermore, this computational cost usually cannot be divided efficiently across a parallel architecture. These problems are aggravated when the data are large or arrive sequentially over time. This article proposes a dynamic feature partitioned regression (DFP) approach for efficient online inference in high-dimensional linear regression with large or streaming data. DFP constructs a pseudo posterior density of the parameters at every time point, updating it when new data arrive, and partitions the set of parameters to exploit parallelization for efficient posterior computation. The proposed approach is applied to Gaussian scale mixture priors and spike-and-slab priors and is found to yield state-of-the-art inferential performance. DFP enjoys theoretical support: the pseudo posterior densities become arbitrarily close to the full posterior as new data arrive.
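
The keywords mention sufficient statistics and streaming data. As a rough illustration of why sufficient statistics help in the streaming setting, the sketch below accumulates X'X and X'y batch by batch for a plain conjugate Gaussian linear regression, so that earlier data never need to be revisited. This is not the DFP algorithm described in the abstract: the Gaussian prior, the unit noise variance, the toy data, and the helper names `update_stats`, `solve`, and `posterior_mean` are all assumptions made for the example.

```python
# Illustrative sketch only: streaming sufficient statistics for a conjugate
# Gaussian linear regression. This is NOT the DFP method from the abstract.

def update_stats(xtx, xty, X_batch, y_batch):
    """Add one batch's contribution to X'X and X'y in place."""
    p = len(xty)
    for row, y in zip(X_batch, y_batch):
        for j in range(p):
            xty[j] += row[j] * y
            for k in range(p):
                xtx[j][k] += row[j] * row[k]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def posterior_mean(xtx, xty, prior_precision=1.0):
    """Posterior mean for beta ~ N(0, I / prior_precision), assuming unit noise variance."""
    p = len(xty)
    A = [[xtx[j][k] + (prior_precision if j == k else 0.0) for k in range(p)]
         for j in range(p)]
    return solve(A, xty)


# Toy data (assumed): 4 observations, 2 features.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [1.0, 2.0, 3.0, 4.0]
p = 2

# Streaming pass: two batches of two observations each.
xtx = [[0.0] * p for _ in range(p)]
xty = [0.0] * p
for start in range(0, len(y), 2):
    update_stats(xtx, xty, X[start:start + 2], y[start:start + 2])
streamed = posterior_mean(xtx, xty)

# Single pass over all data, for comparison.
xtx_full = [[0.0] * p for _ in range(p)]
xty_full = [0.0] * p
update_stats(xtx_full, xty_full, X, y)
full = posterior_mean(xtx_full, xty_full)
```

In this conjugate toy case the streamed and full-data statistics coincide, so the two posterior means agree. The abstract's setting (shrinkage and spike-and-slab priors in high dimensions) does not have this closed form, which is where the pseudo posterior and feature partitioning come in.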
Authors who are presenting talks have a * after their name.