JSM 2002

Activity Number:	197
Type:	Topic Contributed
Date/Time:	Tuesday, August 13, 2002 : 10:30 AM to 12:20 PM
Sponsor:	Section on Statistical Computing*
Abstract - #300189
Title:	Advances in Predictive Data Mining
Author(s):	Jerome Friedman*+ and Jacqueline Meulman
Affiliation(s):	Stanford University and Leiden University
Abstract:	A dissimilarity measure for attribute-value data is proposed for use in cluster analysis. It assigns small dissimilarities to pairs of observations that have close values on any subset of the attribute variables, regardless of their values on the complementary set of variables. Using this measure with dissimilarity-based clustering algorithms encourages the detection of subgroups of observations that preferentially cluster on subsets of the variables. The relevant variable subset can differ from cluster to cluster and may partially (or completely) overlap with those of other clusters. Enhancements that increase sensitivity to low-cardinality groups clustering on a small subset of variables are discussed. Applications in several domains, including gene expression arrays, are presented.
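
The abstract does not give the form of the measure, so the following is only a rough sketch of the general idea, not the authors' formula. It assumes a soft-minimum over scaled per-attribute absolute differences, so that a pair of observations is scored as close when it agrees on some of the attributes, whatever happens on the rest. The function name `softmin_dissimilarity` and the parameters `scale` and `lam` are hypothetical and introduced for illustration only.

```python
import numpy as np

def softmin_dissimilarity(x, y, scale, lam=0.2):
    """Illustrative soft-minimum dissimilarity (an assumption, not the authors' measure)."""
    # Per-attribute absolute differences, scaled to comparable units.
    d = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float)) / scale
    # -lam * log(mean(exp(-d / lam))), computed stably by factoring out min(d).
    # Tends to min(d) as lam -> 0 and to mean(d) as lam grows large.
    m = d.min()
    return float(m - lam * np.log(np.mean(np.exp(-(d - m) / lam))))

a = [0.0, 0.0, 0.0, 0.0]
b = [0.0, 0.1, 5.0, -4.0]    # close to a on the first two attributes only
c = [2.0, -2.0, 2.0, -2.0]   # moderately far from a on every attribute
scale = np.ones(4)           # hypothetical per-attribute scales

d_ab = softmin_dissimilarity(a, b, scale)
d_ac = softmin_dissimilarity(a, c, scale)

# Ordinary Euclidean distance, for comparison.
e_ab = float(np.linalg.norm(np.subtract(a, b)))
e_ac = float(np.linalg.norm(np.subtract(a, c)))

print(d_ab < d_ac, e_ab < e_ac)
```

Under these assumptions the soft-minimum ranks `b` as closer to `a` than `c` is, because `a` and `b` agree on two attributes, whereas Euclidean distance ranks them the other way round. Any dissimilarity-based clustering method could then be run on the resulting pairwise matrix.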

The views expressed here are those of the individual authors and not necessarily those of the ASA or its board, officers, or staff.
