Abstract Details

Activity Number: 47 - Highlights from Bayesian Analysis
Type: Invited
Date/Time: Sunday, July 28, 2019 : 4:00 PM to 5:50 PM
Sponsor: Section on Bayesian Statistical Science
Abstract #300013 Presentation
Title: Big Data Bayesian Linear Regression and Variable Selection by Normal-Inverse-Gamma Summation
Author(s): Hang Qian*
Companies: The MathWorks, Inc.
Keywords: Conjugate prior; Hierarchical shrinkage; MapReduce
Abstract:

We introduce the normal-inverse-gamma (NIG) summation operator, which combines Bayesian regression results from different data sources and leads to a simple split-and-merge algorithm for big data regressions. The NIG summation operator is commutative and associative and has an identity element. As a result, regression data can be processed in an embarrassingly parallel fashion, and online updating for streaming (flow) data is justified. The summation operator is also useful for computing the marginal likelihood, which facilitates Bayesian model selection (BMS) methods such as the Bayesian LASSO, stochastic search variable selection, and Markov chain Monte Carlo model composition. Observations are scanned in a single pass, after which the sampler iteratively combines NIG distributions without reloading the data. Computational complexity analysis shows that, when the sample size is large, NIG summation lets BMS run almost as fast as an OLS regression. Simulation studies demonstrate that our algorithms efficiently handle highly correlated big data. A real-world data set on employment and wages is also analyzed.
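
The split-and-merge idea can be illustrated with a minimal Python sketch. The abstract does not define the operator, so everything here is an assumption rather than the author's formulation: the class `NIG`, its fields, and the helpers `from_prior`, `from_data`, and `identity` are hypothetical names. The sketch stores each NIG distribution for the standard conjugate linear regression model in an additive form (precision matrix, precision-weighted mean, twice the shape, and a quadratic term), so that summation reduces to element-wise addition, which is commutative, associative, and has the all-zero element as identity.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class NIG:
    """NIG(mu, Lambda, a, b) stored in an additive form (illustrative assumption)."""
    P: np.ndarray  # precision matrix Lambda
    h: np.ndarray  # Lambda @ mu
    n: float       # 2 * a
    s: float       # 2 * b + mu' Lambda mu

    def __add__(self, other):
        # In this parameterization, summation is plain element-wise addition.
        return NIG(self.P + other.P, self.h + other.h,
                   self.n + other.n, self.s + other.s)

    def moments(self):
        # Recover the usual (mu, a, b) parameters; requires P to be invertible.
        mu = np.linalg.solve(self.P, self.h)
        return mu, self.n / 2.0, (self.s - self.h @ mu) / 2.0


def from_prior(mu0, Lambda0, a0, b0):
    h = Lambda0 @ mu0
    return NIG(Lambda0, h, 2.0 * a0, 2.0 * b0 + float(mu0 @ h))


def from_data(X, y):
    # Contribution of one data chunk: X'X, X'y, chunk size, y'y.
    return NIG(X.T @ X, X.T @ y, float(len(y)), float(y @ y))


def identity(k):
    # All-zero element: adding it leaves any NIG unchanged.
    return NIG(np.zeros((k, k)), np.zeros(k), 0.0, 0.0)


# Simulated data (not from the talk), used only to exercise the sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=600)

prior = from_prior(np.zeros(3), np.eye(3), a0=1.0, b0=1.0)

# Posterior from a single pass over all the data.
full = prior + from_data(X, y)

# Posterior from four chunks, merged one at a time (order does not matter).
merged = prior
for Xc, yc in zip(np.array_split(X, 4), np.array_split(y, 4)):
    merged = merged + from_data(Xc, yc)

mu_full, a_full, b_full = full.moments()
mu_merged, a_merged, b_merged = merged.moments()
```

Under these assumptions, `merged` and `full` agree up to floating-point error, and each chunk's `from_data` term could be computed on a separate worker or as new observations arrive, since only the small summaries need to be combined.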


Authors who are presenting talks have a * after their name.
