Abstract Details

Activity Number: 526
Type: Topic Contributed
Date/Time: Wednesday, August 3, 2016, 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Computing
Abstract #319907
Title: Carpe Datum! Bill Cleveland's Contributions to Data Science and Big Data Analysis
Author(s): Steve Scott*
Companies: Google Analytics
Keywords: Consensus Monte Carlo; data science; big data; divide and recombine; Tessera
Abstract:

In 2001, Bill Cleveland published "Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics." The plan was six pages long and well ahead of its time. The subsequent 15 years have seen the birth of data science as a field. The rise of data science was driven in part by the contemporaneous rise of "big data": the perceived need to analyze very large, automatically collected data sets, whether directly by a human or automatically by machines. Bill has championed the "divide and recombine" strategy for handling big data problems, a strategy embodied in the modeling and visualization software Tessera. In this talk I will briefly review Cleveland's 2001 "Action Plan," emphasize the distinction between data science and big data, and discuss the advantages of the divide-and-recombine strategy for very large data problems.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

 
 
Copyright © American Statistical Association