JSM 2015 Online Program

Abstract Details

Activity Number: 461
Type: Invited
Date/Time: Wednesday, August 12, 2015 : 8:30 AM to 10:20 AM
Sponsor: ASA
Abstract #317702
Title: Big Data and Bayesian Nonparametrics
Author(s): Matt Taddy*
Companies: The University of Chicago

Big Data is often characterized by large sample sizes and strange variable distributions. For example, consumer spending on an e-commerce website can generate tens to hundreds of millions of observations weekly, with density spikes at zero and elsewhere and very fat right tails. Such spending will also be accompanied by a large set of potential covariates. These properties -- big and strange -- call for nonparametric analysis. We revisit a flavor of distribution-free Bayesian nonparametrics that approximates the data generating process (DGP) with a multinomial sampling model. This model then serves as the basis for analysis of statistics -- functionals of the DGP -- that are useful for decision making regardless of the true DGP. For example, we'll discuss analysis of a least-squares indexing of treatment effect heterogeneity onto user characteristics, as well as analysis of decision trees developed for fraud prediction. The result is a framework for scalable nonparametric Bayesian decision making on massive data.
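
The abstract does not spell out an algorithm. One common construction that matches its description (a multinomial model on the observed support, with posterior uncertainty propagated to a functional of the DGP) is the Bayesian bootstrap, and the sketch below assumes that reading. It is an illustration only, not the author's implementation: the function names, the synthetic spending data, and the choice of a single-covariate least-squares slope as the functional are all hypothetical.

```python
import random


def dirichlet_weights(n, rng):
    """Draw one flat Dirichlet(1, ..., 1) weight vector by normalizing Exp(1) draws."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def weighted_slope(x, y, w):
    """Weighted least-squares slope of y on x (with intercept); w must sum to 1."""
    xbar = sum(wi * xi for wi, xi in zip(w, x))
    ybar = sum(wi * yi for wi, yi in zip(w, y))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


def posterior_slopes(x, y, draws=1000, seed=0):
    """Posterior sample of the slope functional: one re-weighting of the data per draw."""
    rng = random.Random(seed)
    n = len(x)
    return [weighted_slope(x, y, dirichlet_weights(n, rng)) for _ in range(draws)]


# Synthetic (made-up) data: a spike at zero plus a fat right tail.
data_rng = random.Random(1)
x = [data_rng.random() for _ in range(200)]
y = [
    0.0 if data_rng.random() < 0.3 else xi * data_rng.lognormvariate(0.0, 1.0)
    for xi in x
]

slopes = sorted(posterior_slopes(x, y))
interval = (slopes[25], slopes[974])  # approximate central 95% posterior interval
sample_slope = weighted_slope(x, y, [1.0 / len(x)] * len(x))  # equal-weight fit
```

No distributional form is assumed for `y`: each posterior draw only re-weights the observed points and recomputes the statistic, so the draws are independent of one another and can be computed in parallel.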

Authors who are presenting talks have a * after their name.

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
