Abstract:
|
Single-cell RNA-seq data enable scientists to study cell developmental trajectories, but the statistical properties of the methods used to estimate these trajectories are not well developed. In this article, we study the identifiability and convergence of embedding each cell into a lower-dimensional space, an important component of cell trajectory estimation methods. Specifically, we develop eSVD (exponential family SVD), which estimates an embedding for each cell with respect to a hierarchical model in which the inner product between latent vectors is the natural parameter of an exponential-family-distributed random variable. Our estimation procedure uses an alternating minimization approach, and we investigate the identifiability conditions. Our method is similar to other matrix factorization methods, but we adapt its underlying algorithm and statistical theory to be more amenable to single-cell analyses. We apply eSVD via Gaussian distributions whose standard deviations are proportional to their means to aid in analyzing the oligodendrocytes in fetal mouse brains. We provide diagnostics and simulations to demonstrate the validity of both our method and results.
|
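To make the alternating minimization idea in the abstract concrete, the following is a minimal sketch and not the eSVD implementation. It assumes a Poisson family (chosen here only because its canonical form is simple; the abstract's application uses Gaussian distributions instead), takes a single backtracking gradient step per block rather than a full minimization, and uses hypothetical function names (`neg_log_lik`, `grad_rows`, `descent_step`, `fit`). Each entry A_ij is modeled with natural parameter theta_ij = <x_i, y_j>, and the cell embeddings X and gene embeddings Y are updated in turn while the other is held fixed.

```python
import math
import random


def neg_log_lik(A, X, Y):
    """Average Poisson negative log-likelihood (up to a constant),
    with natural parameter theta_ij = <X[i], Y[j]>."""
    n, p = len(A), len(A[0])
    total = 0.0
    for i in range(n):
        for j in range(p):
            theta = sum(a * b for a, b in zip(X[i], Y[j]))
            total += math.exp(theta) - A[i][j] * theta
    return total / (n * p)


def grad_rows(A, X, Y):
    """Gradient of neg_log_lik with respect to X, holding Y fixed."""
    n, p, k = len(A), len(Y), len(X[0])
    G = [[0.0] * k for _ in range(n)]
    for i in range(n):
        for j in range(p):
            theta = sum(a * b for a, b in zip(X[i], Y[j]))
            r = (math.exp(theta) - A[i][j]) / (n * p)
            for d in range(k):
                G[i][d] += r * Y[j][d]
    return G


def descent_step(A, X, Y, step=1.0):
    """One gradient step on X with step halving, so the objective
    never increases. Assumption: a single step per block, not a
    full inner minimization."""
    base = neg_log_lik(A, X, Y)
    G = grad_rows(A, X, Y)
    while step > 1e-12:
        X_new = [[x - step * g for x, g in zip(xr, gr)]
                 for xr, gr in zip(X, G)]
        if neg_log_lik(A, X_new, Y) <= base:
            return X_new
        step /= 2
    return X  # no improving step found; keep the current value


def fit(A, k=2, iters=50, seed=0):
    """Alternate between updating X (rows) and Y (columns)."""
    rng = random.Random(seed)
    n, p = len(A), len(A[0])
    X = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n)]
    Y = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(p)]
    At = [list(col) for col in zip(*A)]  # transpose, used for the Y update
    history = [neg_log_lik(A, X, Y)]
    for _ in range(iters):
        X = descent_step(A, X, Y)    # update cell embeddings
        Y = descent_step(At, Y, X)   # update gene embeddings
        history.append(neg_log_lik(A, X, Y))
    return X, Y, history


# Toy count matrix (illustrative values, not real data).
A = [[0, 1, 3], [2, 0, 5], [1, 4, 0], [7, 2, 1]]
X, Y, history = fit(A, k=2, iters=50)
```

Because each block update is accepted only when it does not increase the objective, `history` is non-increasing by construction. The sketch also illustrates why identifiability needs attention: replacing X by XR and Y by Y(R^{-1})^T for any invertible k-by-k matrix R leaves every inner product, and hence the objective, unchanged.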