Abstract:
|
Analysis of clinical and genetic effects can have a large impact on disease prediction. In particular, it is important to identify genetic pathways whose effects are significantly associated with biomarkers. In this work, we consider the problem of variable selection in a semiparametric regression model that captures the effects of clinical covariates and the expression levels of multiple gene pathways. We model the unknown high-dimensional functions of the pathways via multiple Gaussian kernel machines, which allows for the possibility that genes within the same pathway interact with each other. Hence, our variable selection can be viewed as Gaussian process selection. We develop this Gaussian process selection within the Bayesian variable selection framework, and we incorporate prior knowledge of the pathway structure by imposing an Ising prior on the model. Our approach is readily applicable in high-dimensional settings where the sample size is smaller than the number of genes and covariates. To fit the model, rather than using Markov chain Monte Carlo (MCMC), we devise an efficient variational Bayes algorithm. Three simulation studies show that our method has high power to detect significant pathways.
|