Abstract:
|
The high-dimensional nature of genome-wide transcription makes the expected effect size an elusive concept: it involves both the proportion of genes undergoing differential expression (DE) and the distribution of effect sizes across the genes that are differentially expressed. This challenge, together with the complex structure of RNAseq data and the technical factors that influence the quality and precision of measurements, makes traditional power calculation and sample size determination impractical. We advocate simulation-based power evaluation that preserves the high-dimensional nature of the power functions for different genes, so that users can visualize the impact of sample size and of other experimental design choices, such as sequencing depth. As single-cell studies become more prevalent, an additional level of choice complicates the experimental design, namely the number of biological samples and the number of cells in each sample. We will present power analysis for single-cell RNAseq data and discuss the impact of various factors based on semi-parametric simulation.
|
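The idea of simulation-based power evaluation can be illustrated with a minimal sketch. It is not the semi-parametric procedure described in the abstract: it assumes, purely for illustration, that log-expression for a single gene is Gaussian with a common standard deviation, and it uses a normal-approximation test, which is somewhat liberal at small sample sizes. The function names `simulate_power` and `power_table` and all parameter values are hypothetical. Power is reported separately for each combination of sample size and effect size rather than collapsed into a single number.

```python
import random
import statistics
from statistics import NormalDist


def simulate_power(n_per_group, log_fold_change, sd=1.0, alpha=0.05,
                   n_sim=2000, seed=0):
    """Monte Carlo estimate of power for one gene.

    Assumption (not from the abstract): log-expression is Gaussian with
    standard deviation `sd` in both groups, and the two-group comparison
    uses a normal-approximation z-test.
    """
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(n_sim):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(log_fold_change, sd) for _ in range(n_per_group)]
        # Standard error of the difference in means (equal group sizes).
        se = ((statistics.variance(control) + statistics.variance(treated))
              / n_per_group) ** 0.5
        z = (statistics.mean(treated) - statistics.mean(control)) / se
        if abs(z) > crit:
            rejections += 1
    return rejections / n_sim


def power_table(sample_sizes, effect_sizes, **kwargs):
    """Estimated power for every (sample size, effect size) pair."""
    return {
        (n, lfc): simulate_power(n, lfc, **kwargs)
        for n in sample_sizes
        for lfc in effect_sizes
    }


if __name__ == "__main__":
    table = power_table([3, 5, 10, 20], [0.5, 1.0, 2.0], n_sim=1000)
    for (n, lfc), power in sorted(table.items()):
        print(f"n per group = {n:2d}, log fold change = {lfc:.1f}: "
              f"power ~ {power:.2f}")
```

Keeping the result as a table indexed by effect size, rather than averaging over genes, is what lets a user see how a design choice such as sample size changes power for weakly and strongly affected genes separately. A realistic analysis would replace the Gaussian draw with counts resampled or modeled from real data and would add sequencing depth and cells per sample as further dimensions.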