Abstract:
|
Genome-wide measurement of gene expression using microarrays is a promising approach to the classification of forms of cancer that are currently not differentiable but are potentially biologically heterogeneous. This type of molecular classification gives hope for highly individualized and more effective prognosis and treatment of cancer. Statistically, molecular classification is a complex hypothesis-generating activity, involving data exploration, modeling, and expert elicitation. We propose a modeling framework that can be used to inform and organize the development of exploratory tools for classification. Our framework uses latent categories to provide both a statistical definition of differential expression and a precise, experiment-independent definition of a molecular profile. It also generates natural similarity measures for traditional clustering and gives probabilistic statements about the assignment of tumor samples to molecular profiles.
|