Abstract:
|
Gene Set Enrichment Analysis (GSEA) is a computational tool that incorporates prior knowledge, in the form of a priori defined gene sets (e.g., known biological mechanisms), into data-driven gene expression analysis. However, current methods ignore the substantial overlap of genes among multiple gene sets. For time-course data, we propose a new algorithm called "FUNNEL-GSEA" based on functional data analysis. FUNNEL borrows temporal information from neighboring time points and decomposes the overlapping genes using functional extensions of principal component analysis and elastic-net regression. We also establish an equivalence between penalized concurrent functional regression and penalized high-dimensional multivariate regression, which greatly improves computational efficiency. Furthermore, we introduce a weighted Mann-Whitney U test for gene-set-level hypothesis testing, which may also be useful in more general settings. Applied to both simulations and a large-scale time-course gene expression dataset on human influenza infection, FUNNEL shows uniformly better ROC curves and identifies more gene sets relevant to influenza than competing approaches. FUNNEL-GSEA is available as a free R package.
|
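To illustrate the idea of a weighted Mann-Whitney U statistic at the gene-set level, the following is a minimal Python sketch and not the FUNNEL-GSEA implementation. It assumes that each gene in the set carries a nonnegative membership weight (for example, the share of an overlapping gene attributed to this set) and that genes outside the set are unweighted. The function name `weighted_mann_whitney_u`, its arguments, and the normalization to the interval [0, 1] are all assumptions made for illustration.

```python
import numpy as np


def weighted_mann_whitney_u(in_set, out_set, weights):
    """Weighted, normalized U statistic (illustrative assumption, not the package's API).

    in_set  : gene-level scores for genes in the gene set
    out_set : gene-level scores for genes outside the gene set
    weights : nonnegative membership weights, one per in-set gene
    Returns the weighted fraction of (in-set, out-of-set) pairs in which
    the in-set gene has the larger score; ties count as one half.
    """
    x = np.asarray(in_set, dtype=float)
    y = np.asarray(out_set, dtype=float)
    w = np.asarray(weights, dtype=float)
    if x.shape != w.shape:
        raise ValueError("weights must have one entry per in-set gene")
    if y.size == 0:
        raise ValueError("out_set must not be empty")
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be nonnegative with a positive sum")

    # Pairwise comparison matrix: 1 if x_i > y_j, 0.5 if tied, 0 otherwise.
    wins = (x[:, None] > y[None, :]) + 0.5 * (x[:, None] == y[None, :])
    # Fraction of out-of-set genes that each in-set gene outranks.
    per_gene = wins.mean(axis=1)
    # Weighted average over in-set genes.
    return float(np.sum(w * per_gene) / np.sum(w))


if __name__ == "__main__":
    scores_in = [2.5, 0.3, 1.8]
    scores_out = [0.1, 0.9, 1.2, 0.4]
    membership = [1.0, 0.2, 0.6]
    print(weighted_mann_whitney_u(scores_in, scores_out, membership))
```

With equal weights this reduces to the usual Mann-Whitney U statistic divided by the number of pairs, so a value near 0.5 indicates no enrichment. A p-value would still require a null distribution, which this sketch does not attempt to provide.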