Abstract:
|
Many iterative procedures in stochastic optimization exhibit a transient phase followed by a stationary phase. One important example is stochastic gradient descent with constant step size. Typically, during the transient phase the procedure moves quickly towards a region of interest, and during the stationary phase it oscillates around a single stationary point. In this paper, we develop a statistical diagnostic to detect this phase transition. We present theoretical and experimental results suggesting that, beyond the point of stationarity estimated by the diagnostic, the iterates do not depend on the starting point. In the context of linear regression models, we derive a closed-form solution describing the region where the diagnostic is activated, and support this theoretical result with simulated experiments. Finally, we suggest an application to speed up convergence of stochastic gradient descent by halving the learning rate each time stationarity is detected. This leads to substantial speed gains that, in preliminary studies, are empirically comparable to the state of the art.
|
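The learning-rate halving scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the diagnostic itself, so the code below assumes a Pflug-style statistic: a running sum of inner products of successive stochastic gradients, with stationarity declared once the sum turns negative after a burn-in period. The function name `sgd_halving`, the `burn_in` parameter, and the least-squares loss are illustrative assumptions, not the paper's stated procedure.

```python
import numpy as np

def sgd_halving(X, y, lr=0.05, n_iter=5000, burn_in=100, seed=0):
    """Constant-step SGD for least squares; halves lr when the ASSUMED
    inner-product diagnostic signals stationarity (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    prev_grad = None
    stat = 0.0   # running sum of inner products of successive gradients
    steps = 0    # iterations since the last detection
    lrs = [lr]   # history of learning rates used
    for _ in range(n_iter):
        i = rng.integers(n)                      # sample one data point
        grad = (X[i] @ theta - y[i]) * X[i]      # stochastic gradient
        theta = theta - lr * grad
        if prev_grad is not None:
            stat += float(grad @ prev_grad)
        prev_grad = grad
        steps += 1
        # Assumed rule: a negative running sum after burn-in suggests the
        # iterates are oscillating rather than making progress.
        if steps > burn_in and stat < 0:
            lr /= 2
            lrs.append(lr)
            stat, steps, prev_grad = 0.0, 0, None  # restart the diagnostic
    return theta, lrs

# Example usage on simulated linear regression data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
theta_true = np.arange(1.0, 6.0)
y = X @ theta_true + 0.1 * rng.normal(size=200)
theta_hat, lrs = sgd_halving(X, y)
```

Restarting the statistic after each detection keeps each stationarity check tied to the current learning rate; whether the paper's diagnostic is reset in this way is not stated in the abstract.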