Abstract:
|
This work considers parameter estimation in the two-component symmetric Gaussian mixtures in $d$ dimensions with $n$ independent samples. We show that, even in the absence of any separation between components, with high probability, the EM algorithm converges to an estimate in at most $\sqrt{n}$ iterations, which is within $O((d/n)^{1/4} (log n)^{3/4}$ in Euclidean distance to the true parameter, provided that $n=\Omega(d)$. This is within a logarithmic factor to the minimax optimal rate of $(d/n)^{1/4}$. The proof relies on establishing (a) a non-linear contraction behavior of the population EM mapping (b) concentration of the EM trajectory near the population version.
This is joint work with Harrison Zhou (Yale).
|