Abstract:
|
This paper studies distributed estimation and inference for a general statistical problem with a convex loss that may be non-differentiable. For computational efficiency, we restrict ourselves to stochastic first-order optimization, which enjoys low per-iteration complexity. To motivate the proposed method, we first investigate the theoretical properties of a straightforward Divide-and-Conquer Stochastic Gradient Descent (DC-SGD) approach. Our theory shows that DC-SGD imposes a restriction on the number of machines, and that this restriction becomes more stringent as the dimension p grows. To overcome this limitation, we propose a new multi-round distributed estimation procedure that approximates the Newton step using only stochastic subgradients. The proposed First-Order Newton-type Estimator (FONE) does not estimate the population Hessian matrix, which usually requires second-order differentiability of the loss, and is therefore applicable to non-differentiable losses. Our estimator also facilitates inference for the empirical risk minimizer.
|
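As a rough illustration of the divide-and-conquer idea described in the abstract, the following minimal sketch splits the data across machines, runs one pass of stochastic subgradient descent on each, and averages the local estimates. It is not the paper's exact procedure. The absolute loss (median regression) stands in for a generic non-differentiable convex loss, and the function names `local_sgd` and `dc_sgd`, the step-size schedule `step / t**decay`, and the simulated data are all assumptions made for this example.

```python
import numpy as np


def local_sgd(X, y, step=0.5, decay=0.6):
    """One pass of stochastic subgradient descent on the absolute loss.

    The schedule step / t**decay is an assumed choice, not taken from the paper.
    """
    theta = np.zeros(X.shape[1])
    for t, (x_t, y_t) in enumerate(zip(X, y), start=1):
        # Subgradient of |y_t - x_t @ theta| with respect to theta.
        subgrad = -np.sign(y_t - x_t @ theta) * x_t
        theta -= step / t**decay * subgrad
    return theta


def dc_sgd(X, y, n_machines):
    """Split the samples, run SGD on each part, and average the local estimates."""
    X_parts = np.array_split(X, n_machines)
    y_parts = np.array_split(y, n_machines)
    local = [local_sgd(Xk, yk) for Xk, yk in zip(X_parts, y_parts)]
    return np.mean(local, axis=0)


# Simulated data (hypothetical): linear model with standard normal noise.
rng = np.random.default_rng(0)
theta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(20000, 3))
y = X @ theta_true + rng.normal(size=20000)

theta_hat = dc_sgd(X, y, n_machines=4)
print(theta_hat)
```

With a fixed total sample size, increasing `n_machines` leaves fewer samples per machine, so each local estimate becomes less accurate; this is the setting in which the restriction on the number of machines arises.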