Abstract:
|
Distance correlation is popular for testing independence: the sample statistic is straightforward to compute, works with any choice of metric or kernel, and is asymptotically zero if and only if the two variables are independent. One major bottleneck is the testing process: the null distribution of distance correlation depends on the choice of metric and on the marginal distributions, and cannot be easily estimated. To compute a p-value, the standard approach is to estimate the null distribution via permutation, which is very costly for large amounts of data. In this paper, we propose a chi-square distribution to approximate the null distribution of the unbiased distance correlation. We prove that the chi-square distribution either equals or closely approximates the null distribution, and always dominates the null distribution in the upper tail. The resulting distance correlation chi-square test works with any metric of strong negative type or any characteristic kernel, is valid and universally consistent for testing independence, achieves finite-sample testing power similar to that of the standard permutation test, and is provably the most powerful among all valid distribution-based tests of distance correlation.
|
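To make the idea concrete, the following is a minimal sketch of a chi-square test built on the unbiased distance correlation, for one-dimensional samples with the absolute-difference metric. The abstract does not spell out the exact form of the approximation; the sketch assumes the sample size times the unbiased statistic, plus one, is compared against a chi-square distribution with one degree of freedom. The function names (`unbiased_dcor`, `dcor_chi2_pvalue`) are hypothetical and chosen only for illustration.

```python
import math


def _u_center(d):
    """U-center a symmetric distance matrix (zero diagonal in the output)."""
    n = len(d)
    row = [sum(r) for r in d]
    total = sum(row)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                out[i][j] = (d[i][j]
                             - row[i] / (n - 2)
                             - row[j] / (n - 2)
                             + total / ((n - 1) * (n - 2)))
    return out


def _inner(a, b):
    """Unbiased inner product of two U-centered matrices."""
    n = len(a)
    s = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n) if i != j)
    return s / (n * (n - 3))


def unbiased_dcor(x, y):
    """Unbiased (bias-corrected) distance correlation of two 1-D samples."""
    if len(x) != len(y) or len(x) < 4:
        raise ValueError("need two samples of equal length n >= 4")
    a = _u_center([[abs(u - v) for v in x] for u in x])
    b = _u_center([[abs(u - v) for v in y] for u in y])
    vx, vy = _inner(a, a), _inner(b, b)
    if vx <= 0 or vy <= 0:
        return 0.0
    return _inner(a, b) / math.sqrt(vx * vy)


def dcor_chi2_pvalue(x, y):
    """Return (statistic, p-value) without any permutation.

    Assumption: n * statistic + 1 is compared with a chi-square(1) variable.
    """
    n = len(x)
    stat = unbiased_dcor(x, y)
    z = n * stat + 1
    if z <= 0:
        return stat, 1.0
    # Survival function of chi-square with 1 degree of freedom:
    # P(chi2_1 > z) = erfc(sqrt(z / 2)).
    return stat, math.erfc(math.sqrt(z / 2))
```

Calling `dcor_chi2_pvalue(x, y)` on two equal-length lists of numbers returns the statistic and an approximate p-value in a single pass over the two distance matrices, whereas a permutation test would recompute the statistic once per permutation.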