Activity Number: 593 - Computationally Intensive and Machine Learning Methods
Type: Contributed
Date/Time: Wednesday, August 1, 2018: 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Computing
Abstract #328598

Title: Weighted Stochastic Gradient Descent Algorithm
Author(s): Xueying Tang* and Zhi Wang and Jingchen Liu
Companies: Columbia University and Columbia University and Columbia University
Keywords: active learning; stochastic gradient descent; weighted sampling

Abstract:
Stochastic gradient descent (SGD) is one of the most popular optimization algorithms in large-scale machine learning applications. In each iteration of classic SGD, the estimated solution to the optimization problem is updated based on a uniformly sampled observation. In this article, we propose to sample observations according to a distribution that is adaptively determined by minimizing the trace of the variance-covariance matrix of the stochastic updates. A subset sampling scheme reduces the computational burden of computing this distribution while retaining the benefits of weighted sampling. We demonstrate the superior performance of the proposed method through several simulated examples and a real knowledge graph completion example. The method can also be applied to adaptive experimental design and active learning problems, in which case the sampling probability for each candidate is proportional to the square root of the trace of its Fisher information matrix.
Authors who are presenting talks have a * after their name.
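The abstract gives no code, but the general idea can be illustrated with a small sketch. For a single-draw, importance-weighted stochastic gradient, the sampling distribution minimizing the trace of the update's variance-covariance matrix is proportional to the per-observation gradient norms, so the sketch below uses that weighting, computed only on a random candidate subset to loosely mirror the subset sampling scheme mentioned above. The least-squares loss, the function name weighted_sgd, and all parameter choices are illustrative assumptions, not the authors' implementation.

import numpy as np

def weighted_sgd(X, y, n_iters=2000, lr=0.05, subset_size=64, seed=0):
    # Weighted SGD sketch for a least-squares loss (illustrative, not the
    # authors' implementation). Each iteration scores a small random subset
    # of observations, samples one with probability proportional to its
    # gradient norm, and applies an importance-weighted update so the
    # stochastic gradient stays unbiased for the subset's average gradient.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_iters):
        # Subset sampling: compute weights only for a few candidates.
        idx = rng.choice(n, size=min(subset_size, n), replace=False)
        residuals = X[idx] @ theta - y[idx]
        grads = residuals[:, None] * X[idx]            # per-observation gradients
        norms = np.linalg.norm(grads, axis=1) + 1e-12  # avoid zero probabilities
        probs = norms / norms.sum()                    # non-uniform sampling weights
        j = rng.choice(len(idx), p=probs)
        weight = 1.0 / (len(idx) * probs[j])           # importance weight
        theta -= lr * weight * grads[j]
    return theta

# Toy usage: recover a linear model from noisy observations.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
theta_true = rng.normal(size=5)
y = X @ theta_true + 0.1 * rng.normal(size=500)
print(np.round(weighted_sgd(X, y), 2), np.round(theta_true, 2))

The importance weight 1/(len(idx) * probs[j]) keeps each update unbiased for the subset's average gradient; with gradient-norm weighting, the magnitude of every weighted update equals that subset's mean gradient norm, which is one way non-uniform sampling can reduce the variance of the updates relative to uniform sampling.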