Abstract:
|
We study identifiability and consistency in nonparametric mixture models as well as closely related mixture of regression (also known as mixed regression) models, where the regression functions are allowed to be nonparametric and the error distributions are assumed to be convolutions of a Gaussian density. We construct uniformly consistent estimators under general conditions while simultaneously highlighting several pain points in extending existing pointwise consistency results to uniform results. The resulting analysis turns out to be nontrivial, and several novel technical tools are developed along the way. We also establish a super- polynomial lower bound on the sample complexity of learning the component distributions in such models. The proof relies on a fast rate for approximation with Gaussians, which may be of independent interest. This result has important implications for the hardness of learning more general nonparametric latent variable models that frequently arise in machine learning applications.
|