Abstract:
|
As it progresses within a patient, cancer evolves into multiple subpopulations of cancer cells, some co-located and some at distant sites. Genome sequencing of cancer tissue holds great promise for understanding how cancers develop and for personalizing treatment. In bulk sequencing, however, the observations come from a mixture of multiple subpopulations as well as normal cells. The problem of inferring the distinct subpopulations and their phylogenetic relationships can be formulated as a tree-structured clustering problem in which both the number of clusters and the tree topology are unknown. We propose a novel inference method for this problem that is advantageous in both computation and accuracy and that handles complicating issues that arise in practice.
|