Abstract:
|
In high-dimensional clustering problems, it is often the case that only a small number of features drive meaningful differences in cluster membership, while the remaining features contribute only noise. Sparse clustering using a Lasso penalty, introduced by Witten and Tibshirani, can be applied to k-means and standard hierarchical clustering procedures and has been shown to produce more interpretable and more accurate clusters than non-sparse implementations. In particular, sparse hierarchical clustering can be achieved by penalizing the distance matrix that is input to the algorithm. We consider a specific type of hierarchical clustering, monothetic clustering, a divisive method that recursively partitions the data on one feature at a time. By imposing a Lasso constraint on the distance matrix input to monothetic clustering, we obtain a sparse monothetic clustering. We compare our results on real and simulated datasets to other methods, both sparse and non-sparse.
|