JSM 2015 Online Program

Abstract Details

Activity Number: 157
Type: Invited
Date/Time: Monday, August 10, 2015, 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Learning and Data Mining
Abstract #318027
Title: DeepDive: A Data System for Macroscopic Science
Author(s): Christopher Re*
Companies: Stanford University

Many pressing questions in science are macroscopic: answering them requires scientists to integrate information from numerous data sources, often expressed in natural language or in graphics. These forms of media are fraught with imprecision and ambiguity and so are difficult for machines to understand. Here I describe DeepDive, a new type of system designed to cope with these problems. It combines extraction, integration, and prediction into one system. For some paleobiology and materials science tasks, DeepDive-based systems have surpassed human volunteers in data quantity and quality (recall and precision). DeepDive is also used by scientists in areas including genomics and drug repurposing, by a number of companies involved in various forms of search, and by law enforcement in the fight against human trafficking. DeepDive does not allow users to write algorithms; instead, it asks them to write only features. For some of DeepDive's structured programs, we can prove that the underlying Gibbs sampling converges in polynomial time. I will describe our nascent theory in this direction.

DeepDive is open source on GitHub and available from DeepDive.stanford.edu.

Authors who are presenting talks have a * after their name.

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
