
Abstract Details

Activity Number: 244 - Advances in Statistical Machine Learning
Type: Contributed
Date/Time: Tuesday, August 9, 2022, 8:30 AM to 10:20 AM (EDT)
Sponsor: IMS
Abstract #323380
Title: Benign Overfitting Without Linearity: Neural Network Classifiers Trained by Gradient Descent for Noisy Linear Data
Author(s): Spencer Frei* and Niladri Chatterji and Peter L. Bartlett
Companies: University of California, Berkeley and Stanford University and University of California, Berkeley
Keywords: statistical learning theory; deep learning; overfitting; interpolation; neural networks
Abstract:

Deep learning has revealed a surprising statistical phenomenon: the possibility of 'benign' overfitting. Experiments have shown that trained neural networks are capable of simultaneously (1) overfitting to datasets that have substantial amounts of random label noise and (2) generalizing well to unseen data, a behavior that is inconsistent with the familiar bias-variance tradeoff in classical statistics. In this talk we investigate this phenomenon theoretically for two-layer neural networks trained by gradient descent on the cross-entropy loss. We assume the data comes from well-separated class-conditional distributions and allow for a constant fraction of the training labels to be corrupted by an adversary. We show that in this setting, neural networks indeed exhibit benign overfitting: despite the non-convex nature of the optimization problem, the empirical risk is driven to zero, overfitting the noisy labels; and as opposed to the classical intuition, the networks simultaneously generalize near-optimally. In contrast to previous works on benign overfitting that require linear or kernel-based predictors, our analysis holds in a setting where both the model and the learning dynamics are fundamentally nonlinear.
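As a concrete illustration of the setting described above, the following minimal sketch trains a two-layer leaky-ReLU network by full-batch gradient descent on the logistic (cross-entropy) loss, using Gaussian class-conditional data with a fraction of flipped training labels. Every specific choice here (dimension, width, step size, noise rate, leaky-ReLU slope, fixed second-layer weights) is an illustrative assumption, not the paper's exact construction or regime; whether the training error actually reaches zero depends on these hyperparameters, whereas the talk's guarantees concern the high-dimensional setting analyzed in the paper.

# Sketch only: two-layer leaky-ReLU network, logistic loss, gradient descent,
# Gaussian class-conditional data with a constant fraction of flipped labels.
import numpy as np

rng = np.random.default_rng(0)
d, n, m = 500, 100, 50            # input dim, train size, hidden width (assumed)
noise_rate, lr, steps = 0.1, 0.05, 2000

# Class-conditional data: y = +/-1, x = y * mu + standard Gaussian noise.
mu = np.zeros(d)
mu[0] = 5.0                        # "well-separated" means (assumed separation)

def sample(n_samples):
    y = rng.choice([-1.0, 1.0], size=n_samples)
    x = y[:, None] * mu + rng.normal(size=(n_samples, d))
    return x, y

X, y_clean = sample(n)
y = y_clean.copy()
flip = rng.random(n) < noise_rate  # corrupt a fraction of the training labels
y[flip] *= -1.0

# Network f(x) = sum_j a_j * phi(w_j . x) with leaky-ReLU phi; the second
# layer is fixed at +/- 1/sqrt(m) (a common simplification, assumed here).
W = rng.normal(scale=1.0 / np.sqrt(d), size=(m, d))
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)
alpha = 0.1                        # leaky-ReLU slope

def forward(X, W):
    pre = X @ W.T                  # (n, m) pre-activations
    act = np.where(pre > 0, pre, alpha * pre)
    return act @ a, pre

for _ in range(steps):
    f, pre = forward(X, W)
    # Logistic loss ell(z) = log(1 + exp(-z)) at margins z = y * f(x);
    # g_i = y_i * ell'(y_i f_i).
    g = -y / (1.0 + np.exp(y * f))
    phi_prime = np.where(pre > 0, 1.0, alpha)
    # Gradient of the empirical risk w.r.t. the first-layer weights W.
    grad_W = ((g[:, None] * phi_prime) * a).T @ X / n
    W -= lr * grad_W

f_train, _ = forward(X, W)
X_test, y_test = sample(2000)
f_test, _ = forward(X_test, W)
print("train error (against noisy labels):", np.mean(np.sign(f_train) != y))
print("test error (against clean labels): ", np.mean(np.sign(f_test) != y_test))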


Authors who are presenting talks have a * after their name.
