Abstract:
|
Randomized dimension reduction has recently become a powerful tool in machine learning, numerical linear algebra, and signal processing. We consider random projection methods in the context of non-convex optimization problems arising in machine learning and statistics. First, we introduce a statistical model in which computing the maximum likelihood estimator reduces to fitting a single-layer neural network. We prove that a second-order optimization method with a suitable initialization recovers the global optimum under certain assumptions on the data. We then develop random projection, sampling, and distributed optimization strategies that enable solving large-scale non-convex learning problems faster than existing methods.
|