Abstract:
|
This talk will provide an overview of recent results in high-dimensional robust estimation. The key question is the following: given a dataset, some fraction of which consists of arbitrary outliers, what can be learned about the non-outlying points? This is a classical question going back at least to Tukey (1960). However, this question has recently received renewed interest for a combination of reasons. First, many of the older results do not give meaningful error bounds in high dimensions (for instance, the error often includes an implicit sqrt(d)-factor in d dimensions). Second, recent connections have been established between robust estimation and other problems such as clustering and learning of stochastic block models. Currently, the best known results for clustering mixtures of Gaussians are via these robust estimation techniques. Finally, high-dimensional biological datasets with structured outliers such as batch effects, together with security concerns for machine learning systems, motivate the study of robustness to worst-case outliers from an applied direction.
|