Activity Number: 197
Type: Topic Contributed
Date/Time: Tuesday, August 13, 2002, 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Computing*
Abstract - #300189
Title: Advances in Predictive Data Mining
Author(s): Jerome Friedman*+ and Jacqueline Meulman
Affiliation(s): Stanford University and Leiden University
Abstract:
A dissimilarity measure for value-attribute data is proposed for use in cluster analysis. It assigns small dissimilarities to observation pairs that have close values on any subset of the attribute variables, regardless of their values on the complement set of variables. Using this measure in conjunction with dissimilarity-based clustering algorithms encourages the detection of subgroups of observations that preferentially cluster on subsets of the variables. The relevant variable subsets for each individual cluster can be different and can partially (or completely) overlap with those of other clusters. Enhancements that increase sensitivity for detecting groups of especially low cardinality that cluster on a small subset of variables are discussed. Applications in several different domains, including gene expression arrays, are presented.
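The abstract does not give the form of the proposed measure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a soft minimum over range-scaled per-attribute absolute differences; the function name `softmin_dissimilarity` and the parameter `lam` are hypothetical. A pair that agrees on some attributes receives a small dissimilarity even when it differs strongly on the remaining ones.

```python
import numpy as np

def softmin_dissimilarity(X, lam=0.2):
    """Pairwise dissimilarity that is small when two rows agree on SOME attributes.

    Illustrative assumption only: a soft minimum over per-attribute
    distances, not the measure presented in the talk.
    """
    X = np.asarray(X, dtype=float)
    # Scale each attribute by its range so distances are comparable (assumption).
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # constant attributes contribute zero distance
    Z = X / span
    # d[i, j, k] = |z_ik - z_jk|: absolute difference on attribute k.
    d = np.abs(Z[:, None, :] - Z[None, :, :])
    # Soft minimum over attributes: -lam * log(mean_k exp(-d_k / lam)).
    # Small lam approaches min_k d_k; large lam approaches the mean over k.
    return -lam * np.log(np.mean(np.exp(-d / lam), axis=2))

# Rows 0 and 1 agree on the first two attributes only; rows 0 and 2 agree on none.
X = [[0, 0, 0, 0],
     [0, 0, 10, 10],
     [10, 10, 10, 10]]
D = softmin_dissimilarity(X)
```

The resulting matrix `D` could then be passed to any dissimilarity-based clustering routine. In this toy data the average per-attribute difference between rows 0 and 1 is 0.5, but the soft minimum rates the pair as much closer because the two rows match exactly on half of the attributes.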
- The address information is for the authors who have a + after their name.
- Authors who are presenting talks have a * after their name.