Abstract:
|
In the past few decades, many clustering algorithms have been developed to cluster genes in a species based on their expression values under different conditions. In comparative genomics, a common strategy is to link genes in different species by homology. An interesting question is how to simultaneously identify gene clusters in two species by combining homology and gene expression information.
Most existing clustering algorithms partition bipartite graphs without considering node covariates, aiming to assign every node to a cluster. However, when the goal is to identify clusters of homologous genes, we are mainly concerned with detecting the most relevant matches between genes on the two sides of the graph. We propose a new algorithm based on bipartite spectral clustering and tight clustering. We formulate the gene homology network between two species as a bipartite graph, with the nodes on each side representing the genes of one species and the node covariates given by gene expression values. Our goal is to identify tight and stable co-clusters of genes in both species simultaneously, with strong homology and similar expression patterns.
|