All Times EDT

Abstract Details

Activity Number: 484 - Methods for High-Dimensional Data in Genetics and Genomics
Type: Contributed
Date/Time: Thursday, August 6, 2020 : 10:00 AM to 2:00 PM
Sponsor: Biometrics Section
Abstract #309582
Title: Robust Estimation of Large Covariance Matrix for Compositional Data with Application to Microbial Inter-Taxa Analysis
Author(s): Arun Srinivasan* and Danning Li and Lingzhou Xue and Xiang Zhan
Companies: Penn State University and Jilin University and Penn State University and NISS and Pennsylvania State University
Keywords: Elliptical Distribution; Shape Matrix; Tyler's M-Estimation; Huber's M-Estimation; Microbiome; High-Dimensional Data

Compositional data arise naturally across many research areas, and accurate estimation of the latent covariance matrix is a key task in compositional data analysis. Real-world compositional datasets are often complicated by their compositional structure, high dimensionality, heavy tails, and possible outliers. To address these challenges, we propose a new robust estimation procedure for the latent shape matrix of high-dimensional heavy-tailed compositional data; the shape matrix is a scalar multiple of the latent covariance matrix whenever the latter exists. The proposed method allows a broad class of elliptical distributions to model the latent log-basis variables and introduces a positive-definite robust estimator of the large latent shape matrix based on the celebrated Tyler's M-estimator (Tyler 1987) and Huber's M-estimator (Huber 1964). We prove theoretical guarantees for the proposed method in the high-dimensional setting, including selection consistency, sign consistency, a convergence rate, and an expected risk bound. We demonstrate the performance of our method through simulation studies and a real application to microbial inter-taxa analysis.
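
For readers unfamiliar with Tyler's M-estimator, the following is a minimal sketch of the classical fixed-point iteration for a shape matrix. It is not the procedure proposed in this abstract: it assumes fully observed, non-compositional data centered at the origin with more samples than variables and no all-zero rows, and it omits the high-dimensional and compositional adjustments the authors describe. The function name `tyler_shape`, its parameters, and the trace-equals-p normalization are illustrative choices, not taken from the abstract.

```python
import numpy as np

def tyler_shape(X, n_iter=200, tol=1e-8):
    """Classical Tyler fixed-point iteration (illustrative sketch).

    X is an (n, p) array of observations assumed centered at the origin,
    with n > p and no all-zero rows. Returns a p x p shape matrix
    normalized so that its trace equals p.
    """
    n, p = X.shape
    sigma = np.eye(p)
    for _ in range(n_iter):
        # Quadratic forms x_i^T sigma^{-1} x_i for every observation.
        d = np.einsum("ij,ij->i", X @ np.linalg.inv(sigma), X)
        # Weighted scatter: (p / n) * sum_i x_i x_i^T / d_i.
        new = (p / n) * (X / d[:, None]).T @ X
        # Fix the arbitrary scale of the shape matrix.
        new *= p / np.trace(new)
        converged = np.linalg.norm(new - sigma, ord="fro") < tol
        sigma = new
        if converged:
            break
    return sigma

# Usage on simulated data (assumed setup, for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5))
S = tyler_shape(X)
```

Because each observation is weighted by the inverse of its own quadratic form, rescaling any row of `X` by a positive constant leaves the estimate unchanged, which is the source of the estimator's insensitivity to heavy tails in the radial direction.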

Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program