Abstract:
|
High-throughput spatial transcriptomics (HST) is a rapidly emerging experimental technology that allows for spatially resolved gene expression profiling at the single-cell level. With HST data, we seek to identify sub-populations within a tissue sample that reflect biological cell types or states. Existing methods ignore the spatial dependence in gene expression, fail to account for features such as skewness and heavy tails, or are heuristic-based and lack the benefits of statistical models. To address this gap, we develop SPRUCE: a Bayesian spatial multivariate finite mixture model based on multivariate skew-normal and skew-t distributions, to identify sub-populations in HST data. We implement a novel combination of Pólya-Gamma data augmentation and spatial random effects to infer spatially correlated mixture component membership probabilities. We evaluate the performance of SPRUCE through comprehensive simulation studies and an application to mouse brain HST data. The R package spruce, an efficient implementation of the proposed models, is currently available from our research group's GitHub repository (https://dongjunchung.github.io/spruce/).
|