Abstract:
|
Permutational non-Euclidean analysis of variance (PERMANOVA) is routinely used in exploratory analysis of microbiome datasets to assess the significance of patterns visualized through dimension reduction. The method rests on the observation that a pairwise distance matrix between observations is sufficient to compute the within-group and between-group sums of squares needed to form the (pseudo) F statistic. Moreover, arbitrary distances, not only Euclidean ones, can be used. PERMANOVA is highly effective for testing the omnibus hypothesis when the multivariate spread is uniform across factor levels. When the data are heteroscedastic and the design is unbalanced, however, PERMANOVA may suffer from type I error inflation and reduced power. To overcome this shortcoming, we explore the original ideas of B.L. Welch for univariate analysis of variance in the presence of heteroscedasticity. We demonstrate that Welch statistics for arbitrary k-level factors can be recovered from any pairwise distance matrix. Empirically, a permutation test using the distance-based Welch statistic is more powerful than PERMANOVA and controls type I error better in unbalanced heteroscedastic data.
|
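The claim that a pairwise distance matrix suffices to form the pseudo F statistic can be illustrated with a minimal sketch. It covers only the standard PERMANOVA pseudo F and its permutation test, not the distance-based Welch statistic. The function names `pseudo_f` and `permutation_p_value`, the toy data, and the choice of 999 permutations are assumptions made for illustration and are not taken from the paper.

```python
import random


def pseudo_f(dist, labels):
    """PERMANOVA pseudo F computed from a square distance matrix (list of lists)."""
    n = len(labels)
    groups = sorted(set(labels))
    k = len(groups)
    # Total sum of squares: squared distances over all pairs, divided by n.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares: the same quantity per group, divided by group size.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        pairs = sum(dist[a][b] ** 2 for p, a in enumerate(idx) for b in idx[p + 1:])
        ss_within += pairs / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(dist, labels, n_perm=999, seed=0):
    """Permutation p-value obtained by shuffling group labels (illustrative defaults)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the observed labeling counts as a permutation.
    return observed, (hits + 1) / (n_perm + 1)


# Toy one-dimensional data with Euclidean distances (hypothetical values).
points = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
labels = ["a", "a", "a", "b", "b", "b"]
dist = [[abs(x - y) for y in points] for x in points]
f_stat, p_value = permutation_p_value(dist, labels)
```

With Euclidean distances on one-dimensional data, as in this toy example, the pseudo F coincides with the classical one-way ANOVA F statistic; any other distance matrix can be passed in unchanged.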