Abstract:
|
When a bivariate sample is heterogeneous in correlation, Pearson’s correlation, Kendall’s tau, and other classical measures can summarize the overall correlation but cannot express the true correlation structure. By ranking samples according to Kendall’s tau, a descending tau-path can be derived and used to find correlated subsets, if such subsets exist. The current tau-path method only detects correlated subsets among uncorrelated samples (H0: population homogeneously uncorrelated); the more general scenario, in which the samples are heterogeneously correlated with different non-zero correlations (H0: population homogeneously correlated), has not been addressed. In this paper, we propose two methods, an empirical copula-based tau-path method and a truncated geometric distribution-based moving average maximum likelihood method, to test whether the sample comes from a homogeneously correlated population without assuming zero correlation, and to identify differently correlated subsets. Simulations show that the proposed methods control the type I error and have reasonable power. Applied to gene expression data, the methods proved useful in uncovering heterogeneous co-expression that was missed by standard methods.
|