Abstract:
|
Suppose we wish to visually explore an n-by-p matrix of real numbers, X_np, where n and p are both very large (say, n ~ 10^9 and p ~ 10^4). We present a new algorithm for subsetting data matrices that makes this exploration feasible. We select a subset of the rows and columns of X_np, X_np -> X[a,b]_mk, where m << n, k << p, a is a row index array of length m, and b is a column index array of length k. We restrict the selection of X_mk to be distance-preserving: distances between the rows of X_mk are linearly related to the distances between the corresponding rows of X_np.
|
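The notation X[a,b]_mk and the distance-preserving criterion can be illustrated with a small NumPy sketch. This is not the algorithm the abstract announces: the index arrays a and b are drawn at random here purely for illustration, and the helper names (pairwise_distances, subset_and_check) and the use of the Pearson correlation as a measure of "linearly related" are assumptions. The sketch only shows how a candidate subset could be formed and how closely its row distances track those of the same rows in the full matrix.

```python
import numpy as np

def pairwise_distances(M):
    """Euclidean distances between all pairs of rows, as a flat vector."""
    sq = np.sum(M ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (M @ M.T)
    d2 = np.maximum(d2, 0.0)                # guard against tiny negative round-off
    iu = np.triu_indices(M.shape[0], k=1)   # each unordered pair once
    return np.sqrt(d2[iu])

def subset_and_check(X, a, b):
    """Form X[a, b] and measure how linearly its row distances relate to
    the distances between the same rows using all p columns.
    (Hypothetical helper; Pearson r is an assumed measure.)"""
    X_mk = X[np.ix_(a, b)]                  # m x k submatrix
    d_full = pairwise_distances(X[a, :])    # rows a, all columns
    d_sub = pairwise_distances(X_mk)        # rows a, columns b only
    r = np.corrcoef(d_full, d_sub)[0, 1]
    return X_mk, r

# Toy sizes standing in for n ~ 10^9, p ~ 10^4.
rng = np.random.default_rng(0)
n, p, m, k = 2000, 200, 50, 40
X = rng.normal(size=(n, p))

# Random index arrays: a placeholder for a real selection rule.
a = np.sort(rng.choice(n, size=m, replace=False))
b = np.sort(rng.choice(p, size=k, replace=False))

X_mk, r = subset_and_check(X, a, b)
print(X_mk.shape, round(float(r), 3))
```

A correlation near 1 would indicate that the chosen columns preserve the row distances up to a linear relation. A real selection procedure would choose a and b to make this hold rather than sampling them at random.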