Abstract:
|
We propose and study a class of simple, nonparametric, yet interpretable measures of association between two random variables $X$ and $Y$ taking values in general topological spaces. These measures, defined using the theory of reproducing kernel Hilbert spaces, capture the strength of dependence between $X$ and $Y$. They are 0 if and only if the variables are independent and 1 if and only if one variable is a measurable function of the other. They can be consistently estimated using the general framework of geometric graphs, which includes $k$-nearest neighbor graphs and minimum spanning trees. Moreover, a sub-class of these estimators is shown to adapt to the intrinsic dimensionality of the underlying distribution. Some of these empirical measures can also be computed in near-linear time. Under the hypothesis of independence between $X$ and $Y$, these empirical measures (properly normalized) have a standard normal limiting distribution. Thus, these measures can also be readily used to test the hypothesis of mutual independence between $X$ and $Y$. We will extend this framework to construct similar measures of conditional association.
|