Keywords: RNA-seq, Power analysis, Interactivity, Shiny
The vast amount of RNA-seq data deposited in Gene Expression Omnibus (GEO) is still a grossly underutilized resource for biomedical research. Re-analysis of these data can lead to novel scientific insights and is routinely used to inform the design of new studies. However, reuse of GEO RNA-seq data is made difficult by the complexity of the processing protocols and analytical tools, which are often inaccessible to biomedical scientists who do not specialize in bioinformatics. To remove the technical roadblocks to reusing these data, we have developed a web application, GREIN (GEO RNA-seq Experiments Interactive Navigator), which provides user-friendly interfaces for processing and analyzing GEO RNA-seq data. GREIN is powered by a back-end computational pipeline for uniform processing of RNA-seq data and by a large number (>7,000) of already processed datasets. The front-end user interfaces provide a wealth of analysis options, including subsetting and downloading processed data, examination of quality control measures, interactive visualization of expression patterns in the whole dataset, sample size and power analysis to inform the experimental design of future studies, construction of differential gene expression signatures and their comprehensive functional characterization, and connectivity analysis with LINCS L1000 data. Besides standard two-group comparison, the differential gene expression analysis module also supports fitting a generalized linear model that accounts for covariates or batch effects. The graphical user interfaces for GREIN are implemented in Shiny and deployed via a robust Docker swarm of load-balanced Shiny servers. The combination of the massive amount of back-end data and the front-end analytics options driven by user-friendly interfaces makes GREIN a unique open-source resource for reusing GEO RNA-seq data. GREIN is accessible at: https://shiny.ilincs.org/grein.