Abstract:
|
Unlike in supervised learning, where feature selection and dimension reduction have been broadly studied, dealing with high dimensionality in unsupervised learning remains a challenging problem whose difficulty depends on the data type and application field. To address this problem, we propose a new variable selection method that efficiently searches for intrinsic feature groups, based on the recently developed hidden Markov model with variable blocks (HMM-VB) for clustering. We recast HMM-VB as a subspace clustering technique and perform feature selection during the construction of its variable blocks. The proposed method provides a systematic way to partition the feature space into intrinsic, irrelevant, redundant, and noisy feature groups, and thereby to pick out a parsimonious set of features that are informative for clustering. We report experiments on simulated data and on real biological data.
|