Activity Number: 57 - Nonparametric Modeling I
Type: Contributed
Date/Time: Sunday, August 8, 2021 : 3:30 PM to 5:20 PM
Sponsor: Section on Nonparametric Statistics
Abstract #318032
Title: Two-Sample Testing in High Dimension via Maximum Mean Discrepancy
Author(s): Hanjia Gao* and Xiaofeng Shao
Companies: Department of Statistics, University of Illinois at Urbana-Champaign and University of Illinois at Urbana-Champaign
Keywords: maximum mean discrepancy (MMD); two sample testing; central limit theorem; Berry-Esseen bound; power analysis; energy distance
Abstract:
Maximum Mean Discrepancy (MMD) is widely used in machine learning and statistics to quantify the distance between two distributions in $p$-dimensional Euclidean space. The asymptotic behavior of MMD test statistics has been well studied when the dimension $p$ is fixed, and several approximation methods have been developed in the literature. Motivated by the increasing dimensionality of data in many scientific areas, we investigate the behavior of MMD test statistics in the high-dimensional setting. Specifically, we study the limit of the sample MMD as both the dimension $p$ and the sample size $n$ diverge to infinity, for a wide range of kernels that includes the popular Gaussian and Laplacian kernels. Our results cover energy distance as a special case. We also derive an explicit rate of convergence under mild assumptions and provide a theoretical analysis of the power. Numerical simulations demonstrate the effectiveness of the proposed test statistic and the accuracy of the normal approximation.
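
For readers unfamiliar with the quantity, the following is a minimal sketch of the standard unbiased estimator of the squared MMD with a Gaussian kernel. It is not the test statistic proposed in the talk, which the abstract does not specify. The function names `gaussian_kernel` and `mmd2_unbiased`, the simulated data, and the bandwidth choice $\sqrt{p}$ are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Kernel matrix with entries exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Guard against tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mmd2_unbiased(x, y, bandwidth):
    """Unbiased (U-statistic) estimate of squared MMD between samples x and y.

    x has shape (n, p) and y has shape (m, p); rows are observations.
    """
    n, m = x.shape[0], y.shape[0]
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    # Within-sample averages exclude the diagonal (i == j) terms.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (n * (n - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (m * (m - 1))
    # The cross-sample average uses all n * m pairs.
    term_xy = k_xy.mean()
    return term_xx + term_yy - 2.0 * term_xy


# Illustrative usage on simulated data (assumed setup, not from the talk).
rng = np.random.default_rng(0)
n, p = 50, 200
x = rng.normal(size=(n, p))
y = rng.normal(loc=0.5, size=(n, p))  # mean shift in every coordinate
print(mmd2_unbiased(x, y, bandwidth=np.sqrt(p)))
```

Scaling the bandwidth with $\sqrt{p}$ keeps the kernel values away from 0 and 1 as the dimension grows in this simulated example; other choices, such as a median-distance heuristic, are equally plausible here.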
Authors who are presenting talks have a * after their name.