Abstract:
|
High-throughput methods such as RNA-Seq have been shown to be useful for understanding disease mechanisms. Recent developments allow deep learning models to predict phenotypes from the expression of all genes simultaneously. However, traditional deep learning approaches ignore the underlying biology when modeling. To tackle this problem, we propose a three-stage deep learning approach to better understand disease biology. First, we use a Graph Embedding Deep Feedforward Network (GEDFN) to integrate known gene regulatory network structures into the deep neural network architecture. The goal of GEDFN is to obtain the weight (strength) of the connection between each pair of associated genes. Second, we apply the Random Walk with Restart (RWR) method to the weighted graph from GEDFN to obtain a relevance score between each gene and all others, which gives the proximity of a target gene to all other genes in the genome. Finally, ranking the weighted utility scores of all genes provides meaningful guidance for identifying biomarkers of a particular disease. A real example using lupus trial data demonstrates the feasibility of applying GEDFN and RWR to identify biomarkers.
|
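The Random Walk with Restart step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the restart probability, the normalization scheme, or the convergence criterion, so the values below (a restart probability of 0.3, column normalization, an L1 tolerance) are assumptions chosen for illustration, and the function name and toy weight matrix are hypothetical rather than taken from the authors' implementation.

```python
import numpy as np


def random_walk_with_restart(weights, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Approximate the steady-state proximity of every node to `seed`.

    `weights` is a non-negative weighted adjacency matrix (for example,
    gene-gene connection strengths). `restart` is an assumed value.
    """
    w = np.asarray(weights, dtype=float)
    col_sums = w.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # leave isolated nodes as zero columns
    transition = w / col_sums          # column-normalized transition matrix

    e = np.zeros(w.shape[0])
    e[seed] = 1.0                      # the walker restarts at the seed node
    p = e.copy()
    for _ in range(max_iter):
        # One step: follow an edge with prob. (1 - restart), else jump to seed.
        p_next = (1 - restart) * (transition @ p) + restart * e
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Hypothetical 4-gene weighted graph (symmetric, made-up weights).
toy_weights = np.array([
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.5, 0.0],
    [0.1, 0.5, 0.0, 0.2],
    [0.0, 0.0, 0.2, 0.0],
])
scores = random_walk_with_restart(toy_weights, seed=0)
ranking = np.argsort(-scores)          # genes ordered from most to least proximal
```

In this sketch, `scores` holds the proximity of each gene to the seed gene and `ranking` orders the genes by that proximity, which mirrors the ranking stage in spirit; the weighted utility score used in the actual method may be defined differently.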