Name: 2018 Joint Statistical Meetings
Start: 2018-07-28T07:00:00+00:00
End: 2018-08-02
Location: Vancouver Convention Centre

Abstract Details

Activity Number:	39 - Topics in Clustering
Type:	Contributed
Date/Time:	Sunday, July 29, 2018 : 2:00 PM to 3:50 PM
Sponsor:	Section on Statistical Computing
Abstract #327028	Presentation
Title:	Sparse Convex Clustering
Author(s):	Binhuan Wang* and Yilong Zhang and Will Wei Sun and Yixin Fang
Companies:	New York University School of Medicine and Merck Research Laboratories and University of Miami School of Business Administration and New Jersey Institute of Technology
Keywords:	Convex clustering; Finite sample error; Group LASSO; High-dimensionality; Sparsity
Abstract:	Convex clustering has attracted recent attention because it addresses the instability of traditional non-convex clustering methods. Although its computational and statistical properties have recently been studied, its performance has not yet been investigated in the high-dimensional setting, where the data contain a large number of features and many of them carry no information about the clustering structure. In this paper, we demonstrate that the performance of convex clustering can be distorted when uninformative features are included in the clustering. To overcome this, we introduce a new clustering method, referred to as Sparse Convex Clustering, that simultaneously clusters observations and performs feature selection. The key idea is to formulate convex clustering as a regularization problem, with an adaptive group-lasso penalty on the cluster centers. To balance the trade-off between cluster fit and sparsity, we develop a tuning criterion based on clustering stability. Theoretically, we obtain a finite-sample error bound for our estimator and further establish its variable selection consistency.
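
The regularized formulation described in the abstract can be illustrated with a small sketch that only evaluates such an objective for given cluster centers. The abstract does not state the exact criterion, so everything here is an assumption made for illustration: the function name, the squared-error fit term, the pairwise Euclidean fusion penalty with weights `w`, the column-wise group-lasso penalty with adaptive weights `u`, and the tuning parameters `gamma1` and `gamma2`. It is not the authors' implementation and does not solve the optimization problem.

```python
import numpy as np


def sparse_convex_clustering_objective(X, A, gamma1, gamma2, w=None, u=None):
    """Evaluate an assumed sparse convex clustering objective.

    X : (n, p) data matrix, one observation per row.
    A : (n, p) matrix of cluster centers, one center per observation.
    gamma1, gamma2 : hypothetical tuning parameters for fusion and sparsity.
    w : (n, n) pairwise fusion weights (assumed; defaults to all ones).
    u : (p,) adaptive feature weights (assumed; defaults to all ones).
    """
    X = np.asarray(X, dtype=float)
    A = np.asarray(A, dtype=float)
    n, p = X.shape
    if w is None:
        w = np.ones((n, n))
    if u is None:
        u = np.ones(p)

    # Fit term: squared distance between each observation and its center.
    fit = 0.5 * np.sum((X - A) ** 2)

    # Fusion term: pulls pairs of centers together, which forms clusters.
    fusion = 0.0
    for i in range(n):
        for k in range(i + 1, n):
            fusion += w[i, k] * np.linalg.norm(A[i] - A[k])

    # Group-lasso term: one group per feature (a column of A), so an
    # uninformative feature can be zeroed out across all centers at once.
    sparsity = np.sum(u * np.linalg.norm(A, axis=0))

    return fit + gamma1 * fusion + gamma2 * sparsity
```

In this sketch, a feature whose column of `A` is entirely zero contributes nothing to the group-lasso term, which is how a penalty of this kind can drop uninformative features while the fusion term merges centers into clusters.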

Authors who are presenting talks have a * after their name.

American Statistical Association