
Abstract Details

Activity Number: 174 - Statistical Optimality in High-Dimensional Models and Tradeoffs with Computational Complexity, Privacy and Communication Constraints
Type: Contributed
Date/Time: Tuesday, August 4, 2020, 10:00 AM to 2:00 PM EDT
Sponsor: IMS
Abstract #312573
Title: A Precise High-Dimensional Asymptotic Theory for Boosting
Author(s): Pragya Sur* and Tengyuan Liang
Companies: Harvard University and University of Chicago
Keywords: boosting; classification; margin; interpolation; over-parametrization
Abstract:

This paper establishes a precise high-dimensional asymptotic theory for Boosting on separable data, taking both statistical and computational perspectives. We consider the setting where the number of features (weak learners) p scales with the sample size n, in an over-parametrized regime. On the statistical front, we provide an exact analysis of the generalization error of Boosting when the algorithm interpolates the training data and maximizes an empirical L1 margin. The angle between the Boosting solution and the ground truth is characterized explicitly. On the computational front, we provide a sharp analysis of the stopping time when Boosting approximately maximizes the empirical L1 margin. Furthermore, we discover that the larger the margin, the smaller the proportion of active features (with zero initialization). At the heart of our theory lies a detailed study of the maximum L1 margin, using tools from convex geometry. The maximum L1 margin can be precisely described by a new system of non-linear equations, which we study using a novel uniform deviation argument. Preliminary numerical results are presented to demonstrate the accuracy of our theory.
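To make the quantities in the abstract concrete, below is a minimal, self-contained sketch (not the authors' implementation) of Boosting as coordinate descent on the exponential loss with a small step size, which on separable data is known to approximately maximize the empirical L1 margin. The synthetic Gaussian design, the sizes n and p, the iteration count, and the step size eta are illustrative assumptions; the sketch tracks three quantities the abstract discusses: the empirical L1 margin, the proportion of active features under zero initialization, and the angle to the ground truth.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical over-parametrized setup: p > n, the coordinates of x act
# as the weak learners, and labels come from a linear ground truth, so
# the training data are separable.
n, p = 100, 400
theta_star = rng.normal(size=p)
theta_star /= np.linalg.norm(theta_star)          # ground-truth direction
X = rng.normal(size=(n, p))
y = np.sign(X @ theta_star)

def l1_margin(theta):
    """Empirical L1 margin: min_i y_i <x_i, theta> / ||theta||_1."""
    return np.min(y * (X @ theta)) / np.sum(np.abs(theta))

# Boosting as coordinate descent on the exponential loss with a small
# step size (shrinkage), started at zero.
theta = np.zeros(p)
eta = 0.05                                        # assumed step size
for _ in range(5000):
    w = np.exp(-y * (X @ theta))                  # exponential-loss weights
    grad = -(X * (y * w)[:, None]).mean(axis=0)   # gradient per coordinate
    j = np.argmax(np.abs(grad))                   # best weak learner this round
    theta[j] -= eta * np.sign(grad[j])            # small step on one coordinate

cos_angle = theta @ theta_star / np.linalg.norm(theta)
print(f"empirical L1 margin:           {l1_margin(theta):.4f}")
print(f"proportion of active features: {np.mean(theta != 0):.3f}")
print(f"cos(angle to ground truth):    {cos_angle:.3f}")
```

Because the iterate starts at zero and each round updates a single coordinate, features never selected stay exactly zero, which is what makes the active-feature proportion meaningful; stopping the loop earlier or later corresponds to the degree of approximate margin maximization that the abstract's stopping-time analysis makes precise.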


Authors who are presenting talks have a * after their name.
