2015 Joint Statistical Meetings - Statistics: Making Better Decisions.

Abstract Details

Activity Number:	614
Type:	Contributed
Date/Time:	Wednesday, August 12, 2015 : 2:00 PM to 3:50 PM
Sponsor:	Section on Statistical Learning and Data Mining
Abstract #317623
Title:	A Pseudo-Supervised Clustering Approach
Author(s):	Xinying Mu* and Mark Kon
Companies:	Boston University and Boston University
Keywords:	pseudo-supervised clustering ; supervised classification ; confusion matrix ; graph partition
Abstract:	Standard clustering builds models based on distance connectivity. We investigate an alternative clustering method that uses a so-called pseudo-supervised approach. The method optimizes over all possible cluster partitions of the data, scoring each partition by the accuracy with which a machine learning (ML) algorithm separates the clusters (now viewed as fixed classes) under ML training and testing. However, this optimization is very computationally intensive. We present an algorithm that hybridizes the pseudo-supervised approach with standard clustering, using a graph-based cluster model. We take a large data set, divide it into n small clusters by standard clustering, and then aggregate these n clusters into m (m < n) larger clusters. The aggregation uses a variant of the pseudo-supervised method: we compute a confusion matrix among the n classes obtained in the first clustering step (using machine classifiers such as SVM and random forest) and use it as the basis for graph clustering. We discuss this algorithm theoretically and apply it to classifying cancer data sets based on gene expression and spectral bio-marker data.
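
The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which graph clustering method the authors use, so this sketch assumes a simple greedy average-linkage merge: the confusion matrix is row-normalized, symmetrized into edge weights, and the two groups whose fine clusters are most often confused with each other are merged until m groups remain. The function name aggregate_clusters and the example matrix are hypothetical, and the confusion matrix is taken as given rather than computed by an SVM or random forest.

```python
def aggregate_clusters(confusion, m):
    """Merge n fine clusters into m groups, using a confusion matrix as graph weights.

    confusion[i][j] is the number of held-out points from fine cluster i
    that a classifier assigned to fine cluster j (assumed to be precomputed).
    Returns a list mapping each fine cluster index to a group label.
    """
    n = len(confusion)
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")

    # Row-normalize counts into rates, then symmetrize into edge weights.
    row_sums = [sum(row) or 1 for row in confusion]
    rate = [[confusion[i][j] / row_sums[i] for j in range(n)] for i in range(n)]
    weight = [[0.0 if i == j else 0.5 * (rate[i][j] + rate[j][i])
               for j in range(n)] for i in range(n)]

    # Assumption: greedy average-linkage merging stands in for the
    # unspecified graph clustering step.
    groups = [[i] for i in range(n)]
    while len(groups) > m:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                total = sum(weight[i][j] for i in groups[a] for j in groups[b])
                score = total / (len(groups[a]) * len(groups[b]))
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        groups[a] = sorted(groups[a] + groups[b])
        del groups[b]

    labels = [0] * n
    for g, members in enumerate(groups):
        for i in members:
            labels[i] = g
    return labels


# Hypothetical confusion matrix for n = 4 fine clusters:
# clusters 0 and 1 are often confused, as are clusters 2 and 3.
example_confusion = [
    [40, 9, 1, 0],
    [8, 41, 0, 1],
    [1, 0, 38, 11],
    [0, 1, 12, 37],
]
print(aggregate_clusters(example_confusion, 2))  # -> [0, 0, 1, 1]
```

The intuition is that fine clusters a classifier cannot tell apart probably belong to the same larger cluster, so high confusion is treated as high similarity. A spectral or min-cut partition of the same weighted graph would be an equally plausible reading of "graph clustering" here.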

Authors who are presenting talks have a * after their name.
