From a statistical standpoint, manifold learning converges at a non-parametric rate of $n^{-1/(\alpha d + \beta)}$, where $d$ denotes the dimension of the data. Thus, accurate manifold learning typically requires large amounts of data. Unfortunately, from a computational standpoint, it is commonly believed that manifold learning algorithms "have poor scaling properties".
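To make the data requirement concrete, one can invert the rate (a sketch using only the constants $\alpha$, $\beta$, and $d$ from the rate above, with $\varepsilon$ standing for the target error):

\[
\varepsilon \asymp n^{-1/(\alpha d + \beta)}
\quad\Longrightarrow\quad
n \asymp \varepsilon^{-(\alpha d + \beta)},
\]

so the sample size needed for a fixed accuracy grows exponentially with the dimension $d$.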
In this talk, we will discuss the question of scalability for manifold learning tasks such as non-linear dimension reduction and semi-supervised learning via Gaussian Processes. We also present a Python package that makes the former practical for data sets with millions of points.
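The package itself is not named in this abstract, so the following is only a hedged point of reference, not the talk's software: a minimal sketch using scikit-learn's SpectralEmbedding (Laplacian eigenmaps) to show the kind of non-linear dimension reduction under discussion. The eigendecomposition at its core is precisely the step whose cost becomes the bottleneck at millions of points.

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import SpectralEmbedding

# Synthetic 3-D data lying near a 2-D manifold (a noisy swiss roll)
X, color = make_swiss_roll(n_samples=2000, noise=0.05, random_state=0)

# Non-linear dimension reduction: recover a 2-D embedding from a
# sparse k-nearest-neighbor graph via Laplacian eigenmaps
embedding = SpectralEmbedding(n_components=2, n_neighbors=10).fit_transform(X)
print(embedding.shape)  # (2000, 2)
```

At this sample size the computation is quick; pushing the same pipeline to millions of points requires the kind of scalable neighbor search and sparse eigensolvers the talk addresses.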