Abstract Details

Activity Number: 122 - Novel Statistical Methods in the Analysis of Big Data
Type: Topic Contributed
Date/Time: Monday, July 29, 2019 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Computing
Abstract #307137
Title: Modified Multidimensional Scaling
Author(s): Qiang Sun*
Companies: University of Toronto
Keywords: clustering; multidimensional scaling; noisy data

Classical multidimensional scaling is an important tool for data reduction in many applications. It takes a distance matrix as input and outputs low-dimensional embedded samples that preserve the pairwise distances between the original data points, treating those points as deterministic in the ambient space. When data are noisy, we found that the quality of the embedded samples produced by classical multidimensional scaling breaks down when either the ambient dimensionality or the noise variance is large. This motivates us to propose a modified multidimensional scaling procedure, which applies a nonlinear shrinkage to the sample eigenvalues. The nonlinear transformation depends on the dimensionality, the sample size, and the moment of the noise. We show that modified multidimensional scaling followed by various clustering algorithms can achieve exact recovery, i.e., all the cluster labels are recovered correctly with probability tending to one. Numerical simulations and two real data applications lend strong support to our proposed methodology.
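
For readers unfamiliar with the baseline procedure, the following is a minimal sketch of classical multidimensional scaling in Python with NumPy. It is not the author's implementation. The abstract does not give the form of the nonlinear shrinkage, so the `shrink` argument is a hypothetical hook that defaults to no shrinkage; the function name `classical_mds` and the toy data are likewise assumptions made for illustration.

```python
import numpy as np


def classical_mds(D, k, shrink=None):
    """Embed n points in k dimensions from an n x n matrix of pairwise distances D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Double-center the squared distances to obtain a Gram (inner-product) matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # eigh returns eigenvalues in ascending order; keep the k largest.
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:k]
    vals, vecs = vals[idx], vecs[:, idx]
    # Hypothetical hook: a user-supplied transformation of the sample eigenvalues.
    # The shrinkage proposed in the abstract is not specified there, so none is assumed.
    if shrink is not None:
        vals = shrink(vals)
    # Guard against small negative eigenvalues before taking square roots.
    vals = np.clip(vals, 0.0, None)
    return vecs * np.sqrt(vals)


# Toy check on noiseless points: the embedding should reproduce the input distances.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
Y = classical_mds(D, k=2)
D_hat = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
print(np.allclose(D, D_hat))
```

In this noiseless toy case the pairwise distances of the embedding should match the input up to floating-point error. The setting studied in the abstract is the noisy one, where the sample eigenvalues would be passed through a shrinkage function before the embedding is formed.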

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program