
All Times EDT

Abstract Details

Activity Number: 466 - Contemporary Statistical Graphics: Methods and Applications
Type: Contributed
Date/Time: Thursday, August 6, 2020 : 10:00 AM to 2:00 PM
Sponsor: Section on Statistical Graphics
Abstract #309880
Title: Computing a Distance-Preserving Submatrix Algorithm to Enable Visualizations of Large Rectangular Data Sets
Author(s): Leland Wilkinson*
Companies: H2O
Keywords: big data; visualization; projections
Abstract:

Suppose we wish to visually explore an n by p matrix X_np of real numbers, where n and p are quite large (say, n ~ 10^9 and p ~ 10^4). We present a new algorithm for subsetting data matrices that makes this exploration feasible. We select a subset of the rows and columns of X_np, X_np -> X[a,b]_mk, where m << n and k << p, a is a row index array of length m, and b is a column index array of length k. We restrict our selection of X_mk to be distance-preserving, meaning that distances between the rows of X_mk are linearly related to the distances between the corresponding rows of X_np.
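
The notation X[a,b] and the distance-preserving criterion can be illustrated with a minimal sketch. The abstract does not say how a and b are chosen, so this example simply draws them at random, which is an assumption and not the author's algorithm. The helper names (pairwise_distances, distance_preservation), the use of Euclidean distance, and the use of Pearson correlation as a measure of "linearly related" are likewise hypothetical choices made only for illustration, on a matrix far smaller than the sizes discussed above.

```python
import numpy as np

def pairwise_distances(rows):
    """Euclidean distances between all pairs of rows, as a flat vector."""
    diff = rows[:, None, :] - rows[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(rows.shape[0], k=1)  # each unordered pair once
    return d[upper]

def distance_preservation(X, a, b):
    """Pearson correlation between row distances computed on all columns
    and row distances computed on the selected columns b, for rows a.
    (Hypothetical diagnostic, not the criterion used by the author.)"""
    full = pairwise_distances(X[a, :])           # rows a, all p columns
    sub = pairwise_distances(X[np.ix_(a, b)])    # rows a, columns b only
    return np.corrcoef(full, sub)[0, 1]

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 200))                 # toy stand-in for X_np

# ASSUMPTION: random index arrays; the abstract's selection rule is not given.
a = rng.choice(X.shape[0], size=50, replace=False)   # row indices, length m
b = rng.choice(X.shape[1], size=20, replace=False)   # column indices, length k

X_sub = X[np.ix_(a, b)]                          # the m by k submatrix X[a,b]
r = distance_preservation(X, a, b)
print(X_sub.shape, round(float(r), 3))
```

A correlation near 1 would indicate that distances among the selected rows are close to linearly related before and after dropping columns; a selection procedure of the kind described in the abstract would presumably choose a and b to keep such a relationship strong.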


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program