Abstract:
|
Unsupervised clustering is widely used in the analysis of biological data. In biology, computationally identified clusters are often taken as new findings, so it is crucial to question their validity. Although the ultimate evaluation of a cluster depends on additional biological experiments, we can evaluate clusters from certain perspectives using the existing data and reveal information hidden in high dimensions. We have developed the covering point set (CPS) analysis method, a validation tool that quantifies and visualizes the uncertainty associated with a cluster. CPS analysis integrates seamlessly into pipelines commonly used for biological data, including algorithm-based clustering and the selection of visualization methods for a given data set.
|