Name: 2020 Joint Statistical Meetings
Start: 2020-08-02T07:00:00+00:00
End: 2020-08-06

All Times EDT

Abstract Details

Activity Number:	168 - SLDS Student Paper Awards
Type:	Topic Contributed
Date/Time:	Tuesday, August 4, 2020, 10:00 AM to 11:50 AM
Sponsor:	Section on Statistical Learning and Data Science
Abstract #309787
Title:	Classification Accuracy as a Proxy for Two-Sample Testing
Author(s):	Ilmun Kim* and Aaditya Ramdas and Aarti Singh and Larry Wasserman
Companies:	Carnegie Mellon University and Carnegie Mellon University and Carnegie Mellon University and Carnegie Mellon University
Keywords:	Classification accuracy; High-dimensional asymptotic; Linear discriminant analysis; Two-sample testing; Hotelling's test; Minimax optimality
Abstract:	When data analysts train a classifier and check if its accuracy is significantly different from chance, they are implicitly performing a two-sample test. We investigate the statistical properties of this flexible approach in the high-dimensional setting. We first present general conditions under which a classifier-based test is consistent, meaning that its power converges to one. To get a finer understanding of the rates of consistency, we study a specialized setting of distinguishing two Gaussians with different means and a common covariance. By focusing on Fisher's linear discriminant analysis (LDA) and its high-dimensional variants, we provide asymptotic but explicit power expressions of classifier-based tests and contrast them with corresponding Hotelling-type tests. Surprisingly, the expressions for their power match exactly in terms of the parameters of interest, and the LDA approach is only worse by a constant factor.
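
The idea in the abstract can be illustrated with a minimal sketch: train Fisher's LDA on half of each sample, measure held-out accuracy, and test whether that accuracy exceeds chance. This is not the authors' procedure or code. The function name `lda_accuracy_test`, the 50/50 split, the small ridge term, and the normal approximation to the null are all assumptions made for illustration.

```python
import math
import numpy as np


def lda_accuracy_test(x, y, rng):
    """Split-sample two-sample test based on held-out LDA accuracy.

    Hypothetical helper for illustration. Returns (accuracy, p_value).
    """
    # Use balanced samples so that chance accuracy is exactly 1/2.
    n = min(len(x), len(y))
    x = x[rng.permutation(len(x))[:n]]
    y = y[rng.permutation(len(y))[:n]]
    half = n // 2
    x_tr, x_te = x[:half], x[half:]
    y_tr, y_te = y[:half], y[half:]

    # Fisher's LDA: class means plus a pooled covariance estimate.
    mu_x, mu_y = x_tr.mean(axis=0), y_tr.mean(axis=0)
    centered = np.vstack([x_tr - mu_x, y_tr - mu_y])
    cov = centered.T @ centered / (len(centered) - 2)
    cov += 1e-6 * np.eye(cov.shape[0])  # assumed ridge for numerical stability
    w = np.linalg.solve(cov, mu_x - mu_y)
    midpoint = (mu_x + mu_y) / 2

    # Held-out accuracy: a positive score predicts the first sample.
    correct = np.sum((x_te - midpoint) @ w > 0) + np.sum((y_te - midpoint) @ w <= 0)
    m = len(x_te) + len(y_te)
    acc = correct / m

    # One-sided normal approximation using variance 1/(4m), which is an
    # upper bound on the null variance of the accuracy for balanced test sets.
    z = (acc - 0.5) / math.sqrt(0.25 / m)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return acc, p_value


rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(200, 5))
y = rng.normal(1.0, 1.0, size=(200, 5))  # mean shift in every coordinate
acc, p_value = lda_accuracy_test(x, y, rng)
print(f"held-out accuracy = {acc:.3f}, p-value = {p_value:.3g}")
```

A small p-value indicates accuracy above chance, which is evidence that the two samples come from different distributions. Sample splitting keeps the held-out predictions independent of the fitted classifier, at the cost of using only half of the data for testing.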

Authors who are presenting talks have a * after their name.

American Statistical Association