Abstract:
We examined 165 Arabidopsis data sets from 18 different labs and identified the most stably expressed genes using a Poisson regression model with random effects. Comparing our list of the 5000 most stably expressed genes with previous results shows general consistency between RNA-Seq and microarray studies. In addition, by virtue of GLM modeling, the variation in expression is partitioned into between-experiment, between-treatment and between-sample components. We found that between-experiment variation is the largest source of variation, followed by between-treatment variation, whereas between-sample variation contributes the least to the total. For the 100 most stably expressed genes, we performed a factor analysis to examine whether additional unobserved factors remained. We found no evidence supporting the existence of latent variables. The stably expressed genes that we identified can provide a more stable and robust basis for RNA-Seq count normalization.
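
As a rough illustration of how a Poisson regression with random effects can partition variation into the three levels named above, one possible formulation is sketched here. The abstract does not state the model, so the nested structure, the normally distributed random effects, the library-size offset and all symbols below are assumptions made for illustration only.

```latex
% Hypothetical notation (not taken from the abstract):
%   y_{ijk} : read count of one gene in sample k, treatment j, experiment i
%   N_{ijk} : assumed library-size offset for that sample
%   \beta   : gene-level mean expression on the log scale
\begin{align*}
y_{ijk} \mid \mu_{ijk} &\sim \mathrm{Poisson}(\mu_{ijk}), \\
\log \mu_{ijk} &= \log N_{ijk} + \beta + a_i + b_{ij} + e_{ijk}, \\
a_i &\sim \mathcal{N}(0, \sigma^2_{\mathrm{exp}}), \quad
b_{ij} \sim \mathcal{N}(0, \sigma^2_{\mathrm{trt}}), \quad
e_{ijk} \sim \mathcal{N}(0, \sigma^2_{\mathrm{sam}}), \\
\sigma^2_{\mathrm{total}} &= \sigma^2_{\mathrm{exp}} + \sigma^2_{\mathrm{trt}} + \sigma^2_{\mathrm{sam}}.
\end{align*}
```

Under this sketch, and assuming the three random effects are independent, a gene would count as stably expressed when its estimated total variance on the log scale is small, and the ratios of each component to the total would give the share of variation at each level.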
Copyright © American Statistical Association.