The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Abstract Details
Activity Number: 175
Type: Contributed
Date/Time: Monday, July 30, 2012 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Computing
Abstract - #304511
Title: Visualization of Big Data Sets Using Package Bigvis
Author(s): Yue Hu*+ and Hadley Wickham
Companies: Rice University and Rice University
Address: 1717 Bissonnet St, Houston, TX, 77005, United States
Keywords: data visualization; large data sets
Abstract:
Visualizing large data sets in R is a challenge. Overplotting arises when scatterplots are used to explore correlations among variables or to identify clusters in the data, and R itself imposes memory and speed limitations. This talk proposes new methods for displaying large data sets and provides fast methods for binning and smoothing the data. Instead of drawing each individual point, we visualize densities, which can be obtained efficiently by binning and smoothing in any number of dimensions.
Our tool works with both base R data frames and Revolution's proprietary XDF format. Revolution's RevoScaleR is a commercial product built on top of R that provides fast statistical analysis for data too big to fit into memory. Our preliminary work allows exploratory analysis of data sets of up to 100 million observations, with each plot taking under one minute to produce.
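The bin-then-smooth idea described in the abstract can be illustrated with a minimal sketch. This is not the bigvis API and not the authors' implementation: the function names `bin_2d` and `smooth_counts`, the square bins, and the box kernel are all assumptions chosen for brevity, and Python is used only for illustration although the talk concerns R.

```python
from collections import Counter
import random


def bin_2d(xs, ys, x_origin, y_origin, width):
    """Count points per square bin of side `width` (hypothetical helper).

    Returns a Counter mapping (i, j) bin indices to counts, so memory
    scales with the number of occupied bins, not the number of points.
    """
    counts = Counter()
    for x, y in zip(xs, ys):
        i = int((x - x_origin) // width)
        j = int((y - y_origin) // width)
        counts[(i, j)] += 1
    return counts


def smooth_counts(counts, radius=1):
    """Spread each bin's count evenly over its (2*radius+1)^2 neighbourhood.

    This is a simple box kernel (an assumption; other kernels are
    possible). The total count is preserved.
    """
    side = 2 * radius + 1
    weight = 1.0 / (side * side)
    smoothed = {}
    for (i, j), c in counts.items():
        for di in range(-radius, radius + 1):
            for dj in range(-radius, radius + 1):
                key = (i + di, j + dj)
                smoothed[key] = smoothed.get(key, 0.0) + c * weight
    return smoothed


if __name__ == "__main__":
    # Simulated data: 100,000 points from a standard bivariate normal.
    rng = random.Random(0)
    xs = [rng.gauss(0, 1) for _ in range(100_000)]
    ys = [rng.gauss(0, 1) for _ in range(100_000)]
    counts = bin_2d(xs, ys, x_origin=-5.0, y_origin=-5.0, width=0.25)
    density = smooth_counts(counts, radius=1)
    print(len(counts), "occupied bins;", len(density), "bins after smoothing")
```

Plotting then operates on the smoothed bin values rather than the raw points, so the cost of drawing depends on the grid resolution instead of the number of observations.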
The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.