Abstract:
|
Gene-based modeling approaches to cancer phenotypes have two main drawbacks: because of the large number of genes included in the analyses, such models often yield very small effect sizes and lack reproducibility. Analyses based on sets of genes that fall within a biological pathway can overcome these drawbacks (Holmans, 2010). In this work, we consider a pathway-based Bayesian generalized linear regression model for the prediction of cancer phenotypes, with a non-local prior on the regression coefficients. Specifically, we propose a novel method of summarizing gene expressions into pathway scores. Our method adaptively estimates the dependency structure among the expression levels of different genes. We use these pathway scores as covariates in the regression model. In addition to prediction, we perform Bayesian variable selection, identifying as significant the pathways that appear in the highest posterior probability model. As applications, we consider two real datasets: a kidney cancer dataset available from TCGA and a breast cancer dataset available from the METABRIC project. We compare our results with an existing analysis based on gene set variation analysis (GSVA) scores.
|