
Abstract Details

Activity Number: 9
Type: Invited
Date/Time: Sunday, August 3, 2014, 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Computing
Abstract #310956
Title: High-performance Kernel Machines with Implicit Distributed Optimization and Randomization
Author(s): Vikas Sindhwani*+ and Haim Avron
Companies: IBM and IBM Research
Keywords: Kernel Methods ; Randomized Methods ; Sketching ; High-performance Computing ; Distributed Optimization ; ADMM
Abstract:

Complex machine learning tasks arising in several domains increasingly require "big models" to be trained on "big data". Such models tend to grow with the complexity and size of the training data, and do not make strong parametric assumptions up front about the nature of the underlying statistical dependencies. Kernel methods constitute a very popular, versatile, and principled statistical methodology for solving a wide range of non-parametric modeling problems. However, their storage requirements and high computational complexity pose a significant barrier to their widespread adoption in big data applications. We propose an algorithmic framework for massive-scale training of kernel-based machine learning models. Our framework combines two key technical ingredients: (i) distributed general-purpose convex optimization for a class of problems involving very large but implicit datasets, and (ii) the use of randomization to significantly accelerate both the training process and prediction speed for kernel-based models. Our approach is based on a block-splitting variant of the Alternating Direction Method of Multipliers (ADMM), carefully reconfigured to handle very large random feature matrices only implicitly, while exploiting hybrid parallelism in compute environments composed of loosely or tightly coupled clusters of multicore machines. Our implementation supports a variety of machine learning tasks by enabling several loss functions, regularization schemes, kernels, and layers of randomized approximations for both dense and sparse datasets, in a highly extensible framework. We study the scalability of our framework on commodity clusters as well as on BlueGene/Q, and provide a comparison against existing sequential and parallel libraries for such problems.
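For readers unfamiliar with the two ingredients, the sketch below illustrates the general pattern the abstract describes: an explicit random Fourier feature map (Rahimi and Recht) approximating a Gaussian kernel, followed by a consensus-ADMM ridge solve over row-partitioned feature blocks. It is a minimal, single-machine illustration under assumed choices (squared loss, L2 regularization, an explicit rather than implicit feature matrix, and the function names used here); it is not the authors' implementation, which handles the random feature matrices implicitly, supports additional losses, regularizers, and kernels, and targets distributed-memory clusters.

```python
import numpy as np

def random_fourier_features(X, D, gamma, rng):
    """Approximate the Gaussian kernel exp(-gamma * ||x - z||^2) with an
    explicit D-dimensional random feature map (Rahimi & Recht, 2007)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def consensus_admm_ridge(Z_blocks, y_blocks, lam, rho=1.0, iters=50):
    """Fit ridge regression on row-partitioned feature blocks with consensus
    ADMM: each block solves a small local least-squares problem, and the
    blocks are driven to agree on a shared weight vector z."""
    N = len(Z_blocks)
    D = Z_blocks[0].shape[1]
    z = np.zeros(D)
    W = np.zeros((N, D))          # local weights w_i, one row per block
    U = np.zeros((N, D))          # scaled dual variables u_i
    # Pre-factor the per-block systems (Z_i^T Z_i + rho I) once.
    chols = [np.linalg.cholesky(Zi.T @ Zi + rho * np.eye(D)) for Zi in Z_blocks]
    rhs0 = [Zi.T @ yi for Zi, yi in zip(Z_blocks, y_blocks)]
    for _ in range(iters):
        for i in range(N):        # these local solves are independent -> parallelizable
            rhs = rhs0[i] + rho * (z - U[i])
            W[i] = np.linalg.solve(chols[i].T, np.linalg.solve(chols[i], rhs))
        # Global averaging step for the L2-regularized consensus variable.
        z = rho * N * (W.mean(0) + U.mean(0)) / (lam + rho * N)
        U += W - z                # scaled dual update
    return z

# Toy usage on synthetic data (all sizes here are illustrative).
rng = np.random.default_rng(0)
n, d, D, gamma, lam = 4000, 10, 300, 0.5, 1e-2
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

Z = random_fourier_features(X, D, gamma, rng)
Z_blocks = np.array_split(Z, 4)          # pretend these blocks live on 4 workers
y_blocks = np.array_split(y, 4)
w = consensus_admm_ridge(Z_blocks, y_blocks, lam)
print("train RMSE:", np.sqrt(np.mean((Z @ w - y) ** 2)))
```

The point of the sketch is the division of labor: randomization replaces the n-by-n kernel matrix with an n-by-D feature matrix, and ADMM splits that matrix into row blocks so each worker only ever touches its own block, which is the regime the abstract scales up to clusters and BlueGene/Q.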


Authors who are presenting talks have a * after their name.

