All Times EDT

Abstract Details

Activity Number: 251 - Statistical Advances in Microbiome Research from Theory to Application
Type: Invited
Date/Time: Tuesday, August 4, 2020 : 1:00 PM to 2:50 PM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #309585
Title: Optimal Estimation of Wasserstein Distance on a Tree with an Application to Microbiome Studies
Author(s): Shulei Wang* and Tony Cai and Hongzhe Li
Companies: University of Pennsylvania and University of Pennsylvania and University of Pennsylvania
Keywords: Estimation of non-smooth functional; Phylogenetic tree; Polynomial approximation

The weighted UniFrac distance, a plug-in estimator of the Wasserstein distance between read counts on a tree, is widely used to measure differences between microbial communities in microbiome studies. Our investigation shows, however, that this plug-in estimator, although intuitive and common in practice, can suffer from bias. Motivated by this finding, we study the problem of optimally estimating the Wasserstein distance between two distributions on a tree from sampled data in the high-dimensional setting, and we establish the minimax rate of convergence. To overcome the bias problem, we introduce a new estimator, referred to as the moment-screening estimator on a tree (MET), which uses an implicit best polynomial approximation that incorporates the tree structure. The new estimator is computationally efficient and is shown to be minimax rate-optimal. Numerical studies using both simulated and real biological datasets demonstrate the practical merits of MET, including reduced bias and more statistically significant differences in the microbiome between patients with inactive Crohn’s disease and normal controls.
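
For readers unfamiliar with the plug-in estimator mentioned above, the following is a minimal sketch of how a Wasserstein distance on a tree can be computed from read counts: each edge contributes its length times the absolute difference in the empirical mass lying below it. The toy tree, the names `TREE`, `subtree_mass`, and `plugin_tree_wasserstein`, and the normalization by total reads are illustrative assumptions. This is not the authors' implementation and it is not the MET estimator, whose construction is not given in the abstract.

```python
from typing import Dict, Tuple

# Hypothetical toy tree: each non-root node maps to
# (parent node, length of the edge joining it to that parent).
TREE: Dict[str, Tuple[str, float]] = {
    "A": ("root", 1.0),
    "B": ("root", 2.0),
    "t1": ("A", 0.5),
    "t2": ("A", 0.5),
    "t3": ("B", 1.0),
}


def subtree_mass(
    tree: Dict[str, Tuple[str, float]], leaf_counts: Dict[str, int]
) -> Dict[str, float]:
    """Empirical proportion of reads falling at or below each non-root node."""
    total = sum(leaf_counts.values())
    if total <= 0:
        raise ValueError("leaf_counts must contain at least one read")
    mass = {node: 0.0 for node in tree}
    for leaf, count in leaf_counts.items():
        node = leaf
        # Walk from the leaf up to the root, adding this leaf's share
        # to every node (that is, every edge) along the way.
        while node in tree:
            mass[node] += count / total
            node = tree[node][0]
    return mass


def plugin_tree_wasserstein(
    tree: Dict[str, Tuple[str, float]],
    counts_p: Dict[str, int],
    counts_q: Dict[str, int],
) -> float:
    """Plug-in estimate: sum over edges of length * |mass_p - mass_q|."""
    mass_p = subtree_mass(tree, counts_p)
    mass_q = subtree_mass(tree, counts_q)
    return sum(
        length * abs(mass_p[node] - mass_q[node])
        for node, (_, length) in tree.items()
    )


# Two samples concentrated on different leaves of the toy tree.
sample_p = {"t1": 10, "t2": 0, "t3": 0}
sample_q = {"t1": 0, "t2": 0, "t3": 10}
distance = plugin_tree_wasserstein(TREE, sample_p, sample_q)
```

In this toy case all mass sits on `t1` in one sample and on `t3` in the other, so the estimate equals the path length between those two leaves. Because the estimate applies an absolute value to noisy empirical proportions, it is a non-smooth function of the data, which is the setting in which the abstract reports bias.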

Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program