
Abstract Details

Activity Number: 40 - Statistical Learning: Theory and Methods
Type: Contributed
Date/Time: Sunday, July 30, 2017 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #322578
Title: Statistical Distances and Two-Sample Multivariate Goodness-of-Fit Tests
Author(s): Yang Chen* and Marianthi Markatou and Georgios Afendras and Bruce George Lindsay
Companies: University at Buffalo, Department of Biostatistics and University at Buffalo, Department of Biostatistics and University at Buffalo, Department of Biostatistics and The Pennsylvania State University, Department of Statistics
Keywords: Statistical distance ; Robustness ; Goodness-of-fit test ; High-dimensional and large sample size data
Abstract:

The goodness-of-fit problem has a long history in the statistical literature, in both the univariate and multivariate cases, and there is a growing trend of using statistical distances to construct goodness-of-fit tests. In this presentation, we discuss the fundamental role of statistical distances in robustness. Specifically, we discuss the class of chi-squared measures and show that Pearson's chi-squared and Neyman's chi-squared distances can be interpreted as suprema of Z and t statistics. We then review existing multivariate two-sample tests from the statistics and machine learning literature and offer a critical analysis of their performance. Simulation experiments on the tests that have readily available R packages indicate that no satisfactory test exists for high-dimensional, large-sample-size data. Two real data sets illustrate the application of these tests. We also provide discussion and recommendations on the use of these methods, with emphasis on their relative performance under the conditions studied.
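
As an illustration of the chi-squared distances mentioned above, the following minimal sketch assumes the usual discrete definitions, which the abstract itself does not spell out: Pearson's distance as the sum of (d - m)^2 / m and Neyman's as the sum of (d - m)^2 / d, where d holds data proportions and m holds model probabilities. It also checks numerically one common reading of the supremum interpretation, namely that each distance is the largest squared standardized gap between the data mean and the model mean of a test function h, with the variance taken under the model (Z-type) or under the data (t-type). The sample-size factor is omitted, and the function names and the three-cell example are hypothetical.

```python
import random


def pearson_chi2(d, m):
    """Pearson's chi-squared distance: sum over cells of (d - m)^2 / m."""
    return sum((di - mi) ** 2 / mi for di, mi in zip(d, m))


def neyman_chi2(d, m):
    """Neyman's chi-squared distance: sum over cells of (d - m)^2 / d."""
    return sum((di - mi) ** 2 / di for di, mi in zip(d, m))


def squared_standardized_gap(h, d, m, w):
    """(E_d[h] - E_m[h])^2 / Var_w[h], with the variance taken under w."""
    gap = sum(hi * (di - mi) for hi, di, mi in zip(h, d, m))
    mean_w = sum(hi * wi for hi, wi in zip(h, w))
    var_w = sum(wi * (hi - mean_w) ** 2 for hi, wi in zip(h, w))
    return gap ** 2 / var_w


# Hypothetical three-cell example: d = data proportions, m = model probabilities.
# All cells must be strictly positive for both distances to be defined.
d = [0.5, 0.3, 0.2]
m = [0.4, 0.4, 0.2]

p2 = pearson_chi2(d, m)
n2 = neyman_chi2(d, m)

# Candidate maximizers: h = (d - m) / m for Pearson, h = (d - m) / d for Neyman.
h_pearson = [(di - mi) / mi for di, mi in zip(d, m)]
h_neyman = [(di - mi) / di for di, mi in zip(d, m)]
sup_pearson = squared_standardized_gap(h_pearson, d, m, w=m)
sup_neyman = squared_standardized_gap(h_neyman, d, m, w=d)

# Randomly drawn test functions should not exceed the distances (up to rounding).
rng = random.Random(0)
random_hs = [[rng.gauss(0, 1) for _ in d] for _ in range(1000)]
max_random_pearson = max(squared_standardized_gap(h, d, m, w=m) for h in random_hs)
max_random_neyman = max(squared_standardized_gap(h, d, m, w=d) for h in random_hs)

print(p2, sup_pearson, max_random_pearson)
print(n2, sup_neyman, max_random_neyman)
```

In this sketch the only difference between the two distances is the weighting in the denominator, which is what separates the model-variance (Z-type) standardization from the data-variance (t-type) one.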


Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

 
 
Copyright © American Statistical Association