Gene expression profiling of single cells (scRNAseq) has refined existing cell types and states and defined new ones. Initial experiments generated many cellular replicates, but focus is increasingly turning towards population-based studies that rely on nested designs, in which a cohort of individuals is repeatedly measured by sampling their cells. Even putatively independent designs have been observed to generate dependent data when batch effects are present.
I apply a previously described two-part, zero-inflated log-Normal random effects model to these dependent data. The variance parameters may be weakly identified in a given transcript, but they are related across the transcripts measured. To leverage this property, I propose a hierarchical model that adaptively shrinks the variance parameters towards a global value. By replacing the observed likelihood with a zero-inflated negative binomial distribution, the model is also shown to apply to traditional (bulk) RNAseq experiments that produce dependent data.
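
As a minimal sketch of what such a two-part model could look like for a single transcript g, suppose z_{gij} indicates whether the transcript is detected in cell j of individual i, and y_{gij} is its log-scale expression when detected. All symbols below (the covariates x_{ij}, coefficients β, random intercepts a and b, variances τ² and σ², and the shrinkage distribution G with hyperparameters η) are illustrative assumptions, not notation taken from the model's original description:

```latex
\[
\begin{aligned}
\operatorname{logit}\,\Pr(z_{gij} = 1 \mid a_{gi})
  &= x_{ij}^{\top}\beta^{D}_{g} + a_{gi},
  & a_{gi} &\sim \mathcal{N}\!\left(0,\ \tau^{2}_{D,g}\right), \\
y_{gij} \mid z_{gij} = 1,\ b_{gi}
  &\sim \mathcal{N}\!\left(x_{ij}^{\top}\beta^{C}_{g} + b_{gi},\ \sigma^{2}_{g}\right),
  & b_{gi} &\sim \mathcal{N}\!\left(0,\ \tau^{2}_{C,g}\right), \\
\tau^{2}_{D,g},\ \tau^{2}_{C,g}
  &\sim G(\,\cdot \mid \eta\,),
  & g &= 1, \dots, G_{\text{transcripts}}.
\end{aligned}
\]
```

In this sketch the individual-level intercepts a_{gi} and b_{gi} induce the dependence among cells from the same individual, and the shared distribution G is where information is pooled across transcripts: a transcript whose own data say little about its variance parameters is pulled more strongly towards the global value implied by η. The specific form of G is left unspecified here.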