JSM Preliminary Online Program
This is the preliminary program for the 2006 Joint Statistical Meetings in Seattle, Washington.

The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.


Activity Number: 383
Type: Contributed
Date/Time: Wednesday, August 9, 2006 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Computing
Abstract - #307498
Title: A Scale-Independent Clustering Method with Automatic Variable Selection Based on Trees
Author(s): Samuel Buttrey*+
Companies: Naval Postgraduate School
Address: Code OR Sb, Monterey, CA, 93950
Keywords: cluster quality ; classification and regression trees ; prediction strength
Abstract:

Clustering techniques usually rely on measurements of distances (or dissimilarities) among observations and clusters. These distances are often affected by variables' scaling or transformation, and do not provide for selection of "important" variables. We fit a set of regression or classification trees; each variable acts in turn as the response variable. Points are "close" to one another if they tend to appear in the same leaves of these trees. Trees with poor predictive power are discarded. "Noise" variables which appear in none of the trees have no effect on the clustering and can be ignored. The clustering is unaffected by linear transformations of the continuous variables and resistant to monotonic ones. Categorical variables are included automatically. We demonstrate the technique on well-known noisy data sets. This paper updates an idea proposed at JSM 2004.
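
The idea in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: it assumes numeric variables only, uses small greedy regression trees without pruning, and treats an R-squared cutoff as a stand-in for "poor predictive power". All function names and parameters (tree_dissimilarity, min_gain, min_r2, and so on) are hypothetical. Two points are taken to be "close" when they share a leaf in most of the retained trees.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(rows, idx, response, predictors, min_leaf):
    """Return (sse_after, predictor, threshold) for the best binary split, or None."""
    best = None
    for p in predictors:
        ordered = sorted(idx, key=lambda i: rows[i][p])
        for k in range(min_leaf, len(ordered) - min_leaf + 1):
            lo, hi = rows[ordered[k - 1]][p], rows[ordered[k]][p]
            if lo == hi:
                continue  # cannot split between tied values
            total = (sse([rows[i][response] for i in ordered[:k]])
                     + sse([rows[i][response] for i in ordered[k:]]))
            if best is None or total < best[0]:
                best = (total, p, lo)  # left branch is "value <= lo"
    return best

def grow(rows, idx, response, predictors, depth, min_leaf, min_gain, leaves):
    """Recursively partition idx, appending each leaf's index list to leaves."""
    parent = sse([rows[i][response] for i in idx])
    split = best_split(rows, idx, response, predictors, min_leaf) if depth > 0 else None
    # Assumption: stop unless the split removes at least min_gain of the node's SSE.
    if split is None or split[0] >= (1.0 - min_gain) * parent:
        leaves.append(idx)
        return
    _, p, t = split
    for side in ([i for i in idx if rows[i][p] <= t],
                 [i for i in idx if rows[i][p] > t]):
        grow(rows, side, response, predictors, depth - 1, min_leaf, min_gain, leaves)

def tree_dissimilarity(rows, max_depth=2, min_leaf=5, min_gain=0.2, min_r2=0.5):
    """Fit one tree per variable (as response); return (distance matrix, kept responses)."""
    n, m = len(rows), len(rows[0])
    leaf_labels, kept = [], []
    for response in range(m):
        predictors = [p for p in range(m) if p != response]
        leaves = []
        grow(rows, list(range(n)), response, predictors,
             max_depth, min_leaf, min_gain, leaves)
        total = sse([r[response] for r in rows])
        within = sum(sse([rows[i][response] for i in leaf]) for leaf in leaves)
        # Discard trees with poor predictive power (assumed criterion: low R-squared).
        if total > 0 and 1.0 - within / total >= min_r2:
            labels = [0] * n
            for label, leaf in enumerate(leaves):
                for i in leaf:
                    labels[i] = label
            leaf_labels.append(labels)
            kept.append(response)
    # Distance = fraction of retained trees in which two points fall in different leaves.
    dist = [[0.0] * n for _ in range(n)]
    if leaf_labels:
        for i in range(n):
            for j in range(n):
                dist[i][j] = sum(t[i] != t[j] for t in leaf_labels) / len(leaf_labels)
    return dist, kept

def make_demo_data(seed=0, per_cluster=20):
    """Two well-separated groups in variables 0 and 1, plus one pure-noise variable."""
    rng = random.Random(seed)
    return [[base + rng.random(), base + rng.random(), rng.random()]
            for base in (0.0, 10.0) for _ in range(per_cluster)]

if __name__ == "__main__":
    dist, kept = tree_dissimilarity(make_demo_data())
    print("responses whose trees were retained:", kept)
```

Because splits depend only on the ordering of each predictor, and rescaling a response multiplies every candidate split's error by the same constant, rescaling a variable leaves this sketch's distance matrix unchanged. The resulting matrix could be passed to any distance-based clustering routine.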


  • The address information is for authors who have a + after their name.
  • Authors who are presenting talks have a * after their name.

JSM 2006 For information, contact jsm@amstat.org or phone (888) 231-3473. If you have questions about the Continuing Education program, please contact the Education Department.
Revised April 2006