540 – Statistical Computing for Machine Learning
Information Based Clustering of Gene Expression Signatures in Primary Breast Carcinoma Patients
Milan Bimali
University of Kansas Medical Center
Michael Brimacombe
University of Kansas Medical Center
We develop and apply a novel clustering approach that accounts for the structure of gene expression profiles through an assumed distribution and uses an information-based measure of similarity among the profiles. The gene expression profile of each subject is assumed to follow a known distribution, so a relative likelihood function (the likelihood function rescaled by its value at the mode) can be constructed for each subject. These relative likelihood functions are then weighted (scaled) by the observed Fisher information to incorporate the accuracy of estimation across the gene expression profiles. Finally, the subjects are clustered by applying standard clustering methods to a distance matrix computed from the weighted relative likelihoods.
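
The pipeline described above might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the distribution, the form of the weighting, the distance, or the clustering method. Here each subject's profile is assumed to be normal with unknown mean, the variance is fixed at its maximum likelihood estimate, the weighting is a simple multiplication by the observed Fisher information, the distance is Euclidean over a grid of candidate means, and the clustering is average-linkage hierarchical clustering. The function names `relative_likelihoods` and `cluster_subjects` and the simulated data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def relative_likelihoods(X, grid):
    """Relative likelihood of a normal mean for each subject (row of X).

    Assumption: profile values are i.i.d. normal, with the variance fixed at
    its MLE. Returns the relative likelihood curves evaluated on `grid`
    (shape: subjects x len(grid)) and the observed Fisher information for the
    mean (shape: subjects x 1).
    """
    n_genes = X.shape[1]
    mean = X.mean(axis=1, keepdims=True)   # MLE of the mean (mode of the likelihood)
    var = X.var(axis=1, keepdims=True)     # MLE of the variance, treated as known
    # L(mu) / L(mu_hat): equals 1 at the MLE by construction
    rel_lik = np.exp(-n_genes * (grid[None, :] - mean) ** 2 / (2.0 * var))
    info = n_genes / var                   # observed Fisher information for mu
    return rel_lik, info


def cluster_subjects(X, n_clusters, grid_size=200):
    """Cluster subjects on information-weighted relative likelihood curves."""
    grid = np.linspace(X.min(), X.max(), grid_size)
    rel_lik, info = relative_likelihoods(X, grid)
    weighted = info * rel_lik              # assumed form of the weighting
    distances = pdist(weighted, metric="euclidean")  # condensed distance matrix
    tree = linkage(distances, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Simulated data for illustration only: 20 subjects, 50 genes, two groups
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(10, 50)),
    rng.normal(3.0, 1.0, size=(10, 50)),
])
labels = cluster_subjects(X, n_clusters=2)
print(labels)
```

Other distributional assumptions would change only `relative_likelihoods`; the distance and clustering steps operate on whatever weighted curves it produces.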