Abstract Details

Activity Number: 507 - Bayesian Data Science and Statistical Science
Type: Topic Contributed
Date/Time: Wednesday, August 1, 2018 : 10:30 AM to 12:20 PM
Sponsor: Section on Bayesian Statistical Science
Abstract #330243
Title: Probabilistic Programming with Non-Parametric Bayesian Model Discovery in BayesDB
Author(s): Vikash Mansinghka* and Feras Saad
Companies: and MIT
Keywords: data science; Bayesian non-parametrics; probabilistic programming; model discovery; programming languages; databases

BayesDB is a probabilistic programming platform that provides built-in non-parametric Bayesian model discovery. BayesDB makes it easy for users to search, clean, and model multivariate databases using an SQL-like language. This talk will illustrate the capabilities and limitations of the current open-source prototype, using applications to real-world databases of Earth satellites and psychiatric health surveys. It will also discuss new research opportunities at the intersection of probabilistic programming and computational statistics.

Probabilistic programming is an emerging field based on the insight that probabilistic models and inference algorithms are a new kind of software, and therefore amenable to radical improvements in accessibility, productivity, and scale. Most probabilistic programming systems, however, require users to write probabilistic programs by hand. In contrast, BayesDB provides a built-in probabilistic program synthesis system that builds generative models for multivariate databases via inference over programs given a non-parametric Bayesian prior. BayesDB also enables statisticians to override these synthesized programs with custom statistical models when appropriate.
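
To make the idea of a non-parametric Bayesian prior over model structure more concrete, the following is a minimal sketch and not BayesDB code. The abstract does not say which prior BayesDB uses, so the Chinese restaurant process here is an assumption chosen only because it is a standard non-parametric prior over partitions. The function name `sample_crp_partition` and the column names are hypothetical. The sketch draws one random grouping of database columns, where the number of groups is not fixed in advance.

```python
import random


def sample_crp_partition(n_items, alpha, rng):
    """Draw cluster labels for n_items items from a Chinese restaurant process.

    alpha > 0 is the concentration parameter: larger values favor more clusters.
    Illustrative only; this is not how BayesDB is implemented.
    """
    labels = []
    cluster_sizes = []
    for _ in range(n_items):
        # An existing cluster is chosen with weight equal to its current size;
        # a brand-new cluster is chosen with weight alpha.
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)  # open a new cluster
        else:
            cluster_sizes[k] += 1    # join an existing cluster
        labels.append(k)
    return labels


# Hypothetical column names, standing in for the variables of a multivariate table.
columns = ["col_a", "col_b", "col_c", "col_d", "col_e"]
rng = random.Random(0)
labels = sample_crp_partition(len(columns), alpha=1.0, rng=rng)

# Collect the columns that were assigned to the same cluster.
groups = {}
for name, label in zip(columns, labels):
    groups.setdefault(label, []).append(name)
print(groups)
```

Each run with a different seed yields a different candidate grouping. Inference over model structure, in this simplified picture, would amount to weighing such candidates against the observed data rather than fixing the structure by hand.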

Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program