Abstract:
|
Most clustering methods are fully unsupervised and tend to perform poorly in high-dimensional settings. In this talk, we consider a semi-supervised clustering approach that incorporates covariate information. To optimize the clustering, we precondition the data with linear transformations (such as projections and stretching) so that the clustering algorithm aligns with known covariate subgroupings. The optimization criterion is based on the variation of information (Meilă 2007). Identifying possible subgroups from existing labels in this way has multiple applications in medical research fields that collect functional data. For example, related diagnostic groups may share common functional trajectories, while other trajectories may be unique to a particular diagnostic group. Our approach is motivated by the need to optimize clustering in these settings.
|
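
As a rough illustration of the criterion named in the abstract, the sketch below computes the variation of information between two labelings of the same observations, using the standard definition VI(A, B) = H(A|B) + H(B|A) in nats. The function name and the toy labelings are hypothetical and are not taken from the talk; how the speakers combine this quantity with the linear preconditioning step is not specified here.

```python
import math
from collections import Counter


def variation_of_information(labels_a, labels_b):
    """Return VI(A, B) = H(A|B) + H(B|A) in nats for two labelings.

    Hypothetical helper for illustration; both inputs are sequences of
    hashable cluster/group labels of equal length.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n == 0:
        return 0.0

    count_a = Counter(labels_a)                   # cluster sizes in A
    count_b = Counter(labels_b)                   # cluster sizes in B
    count_ab = Counter(zip(labels_a, labels_b))   # joint (contingency) counts

    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # Contribution of this cell to H(B|A) + H(A|B).
        vi -= p_ab * (math.log(p_ab / p_a) + math.log(p_ab / p_b))
    return vi


# Toy example: a clustering compared against known covariate subgroups.
covariate_groups = ["dx1", "dx1", "dx2", "dx2"]
clustering = [0, 0, 1, 1]
vi_value = variation_of_information(clustering, covariate_groups)
```

A value of zero means the two partitions agree up to relabeling, and larger values indicate greater disagreement, so a criterion of this kind would favor transformations whose resulting clusters lie close to the known subgroupings.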