Abstract:
|
A priori information, such as biological pathways, is a useful supplement when identifying risk factors for a trait from genomic data. However, commonly used methods for incorporating prior information model only the mean function of the outcome and rely on linearity assumptions that are often unmet. To address these concerns, we propose a method for variable selection in nonparametric additive quantile regression with network regularization, which incorporates the information encoded by known networks. We use a group Lasso penalty to obtain a sparse model. We define the network-constrained penalty as the sum, over all pairs of predictors linked in the known network, of the L2 norm of the difference between their effect functions. We further propose an efficient computational procedure to solve the resulting optimization problem. Simulation studies show that the proposed method identifies more truly associated variables and fewer falsely associated variables than alternative approaches. We apply the proposed method to the microarray gene-expression dataset from the Framingham Heart Study and identify several genes associated with body mass index.
|
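The penalized objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each effect function is expanded in a basis shared by all predictors, so that the L2 norm of the difference between two effect functions can be approximated by the Euclidean norm of the difference between their coefficient vectors. The names `check_loss`, `objective`, `lam_sparse`, and `lam_net` are hypothetical and chosen only for illustration.

```python
import numpy as np


def check_loss(residuals, tau):
    """Quantile (check) loss rho_tau(u) = u * (tau - 1[u < 0]), summed over u."""
    residuals = np.asarray(residuals, dtype=float)
    return float(np.sum(residuals * (tau - (residuals < 0))))


def objective(y, basis, theta, intercept, edges, tau, lam_sparse, lam_net):
    """Evaluate check loss + group Lasso penalty + network penalty.

    y         : (n,) outcome vector
    basis     : (p, n, d) array; basis[j] is the basis expansion of predictor j
    theta     : (p, d) array; theta[j] holds the coefficients of effect function j
    intercept : scalar intercept of the tau-th conditional quantile
    edges     : iterable of (j, k) index pairs linked in the known network
    lam_sparse, lam_net : hypothetical tuning parameters for the two penalties
    """
    y = np.asarray(y, dtype=float)
    basis = np.asarray(basis, dtype=float)
    theta = np.asarray(theta, dtype=float)

    # Additive model: intercept + sum_j basis[j] @ theta[j]
    fitted = intercept + np.einsum("jnd,jd->n", basis, theta)
    loss = check_loss(y - fitted, tau)

    # Group Lasso: one group per predictor, so a whole effect function
    # is either kept or shrunk to zero.
    group_penalty = float(np.sum(np.linalg.norm(theta, axis=1)))

    # Network penalty (assumption: a shared basis lets the coefficient
    # difference stand in for the difference between effect functions).
    network_penalty = float(
        sum(np.linalg.norm(theta[j] - theta[k]) for j, k in edges)
    )

    return loss + lam_sparse * group_penalty + lam_net * network_penalty
```

In this sketch the group penalty drives variable selection, while the network penalty pulls the effect functions of linked predictors toward each other. Minimizing the objective requires a dedicated algorithm, which the sketch does not attempt to reproduce.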