
Abstract Details

Activity Number: 143 - Advancing Translational Research Using Novel Statistical Analyses for Complex and Omics Data
Type: Invited
Date/Time: Monday, July 31, 2017 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #321955
Title: Composition Estimation from Sparse Count Data via a Regularized Likelihood
Author(s): Hongzhe Li* and Yuanpei Cao and Anru Zhang
Companies: University of Pennsylvania and University of Pennsylvania and University of Wisconsin-Madison
Keywords: metagenomics ; microbiome ; matrix completion ; undersampling ; simplex
Abstract:

In microbiome studies, taxa composition is often estimated from sequencing read counts in order to account for the large variability in the total number of observed reads across samples. Because sequencing depth is limited, some rare microbial taxa might not be captured in metagenomic sequencing, which results in many zero read counts. Naive composition estimation using count normalization therefore leads to many zero proportions, which underestimates the underlying compositions, especially for the rare taxa. In this paper, the observed counts are assumed to be sampled from a multinomial distribution, with the unknown composition being the probability parameter in a high-dimensional positive simplex space. Under the assumption that the composition matrix is approximately low rank, a nuclear norm regularization-based likelihood estimation is developed to estimate the underlying compositions of the samples. The theoretical upper bounds and the minimax lower bounds of the estimation errors, measured by the Kullback-Leibler divergence and the Frobenius norm, are established. Simulations and real data analysis will be presented.
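
To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a heuristic proximal-gradient-style loop: a gradient step on the multinomial negative log-likelihood, singular value soft-thresholding for the nuclear norm penalty, and a projection of each row onto a simplex with a small lower bound. Applying the thresholding and the projection in sequence is a simplification, and no convergence claim is made. The function names, the lower bound `eps`, the step size `step`, the penalty `lam`, and the simulated data are all illustrative assumptions and are untuned.

```python
import numpy as np


def project_rows_to_simplex(X, eps):
    """Euclidean projection of each row onto {x : x_j >= eps, sum_j x_j = 1}.

    Assumes p * eps < 1 so the constraint set is non-empty.
    """
    n, p = X.shape
    Y = X - eps                      # shift so the lower bound becomes 0
    total = 1.0 - p * eps            # mass left to distribute after the shift
    U = -np.sort(-Y, axis=1)         # each row sorted in descending order
    css = np.cumsum(U, axis=1) - total
    ks = np.arange(1, p + 1)
    rho = (U - css / ks > 0).sum(axis=1)          # size of the active set per row
    theta = css[np.arange(n), rho - 1] / rho      # per-row shift
    return np.maximum(Y - theta[:, None], 0.0) + eps


def svt(X, tau):
    """Singular value soft-thresholding: the proximal map of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def estimate_composition(W, lam=0.05, eps=1e-4, step=0.5, n_iter=200):
    """Heuristic nuclear-norm-regularized multinomial likelihood estimate.

    W is an (n samples) x (p taxa) matrix of read counts. All tuning
    values here are illustrative defaults, not recommendations.
    """
    W = np.asarray(W, dtype=float)
    N = W.sum()
    naive = W / W.sum(axis=1, keepdims=True)
    X = project_rows_to_simplex(naive, eps)       # start from the naive estimate
    for _ in range(n_iter):
        grad = -W / (N * X)                       # gradient of -(1/N) sum W_ij log X_ij
        X = svt(X - step * grad, step * lam)      # low-rank shrinkage step
        X = project_rows_to_simplex(X, eps)       # back to valid compositions
    return X


# Simulated example (hypothetical data): rows of the true composition matrix
# are mixtures of r base compositions, so the matrix has rank at most r.
rng = np.random.default_rng(0)
n, p, r, depth = 20, 50, 3, 30
weights = rng.dirichlet(np.ones(r), size=n)
bases = rng.dirichlet(np.ones(p), size=r)
X_true = weights @ bases
X_true /= X_true.sum(axis=1, keepdims=True)
W = np.array([rng.multinomial(depth, row) for row in X_true])

naive = W / W.sum(axis=1, keepdims=True)
X_hat = estimate_composition(W)
print("zero proportions (naive):      ", int((naive == 0).sum()))
print("zero proportions (regularized):", int((X_hat == 0).sum()))
```

In this toy setting the sequencing depth (30 reads) is smaller than the number of taxa (50), so the naive normalized estimate necessarily contains zero proportions, while the regularized estimate keeps every entry at or above `eps` and every row on the simplex.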


Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

 
 
Copyright © American Statistical Association