Abstract #301593


The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.





JSM 2002 Abstract #301593
Activity Number: 410
Type: Contributed
Date/Time: Thursday, August 15, 2002 : 10:30 AM to 12:20 PM
Sponsor: Biometrics Section*
Title: A Method to Identify Significant Clusters in Gene Expression Data
Author(s): Katherine Pollard*+ and Mark van der Laan
Affiliation(s): University of California, Berkeley and University of California, Berkeley
Address: School of Public Health, Earl Warren Hall #7360, Berkeley, California, 94720-7360, USA
Keywords: clustering; silhouette; homogeneity; gene expression
Abstract:

Clustering algorithms have been widely applied to gene expression data. For both hierarchical and partitioning clustering algorithms, selecting the number of significant clusters is an important problem, and many methods have been proposed. Existing methods for selecting the number of clusters tend to find only the global patterns in the data (e.g., the over- and under-expressed genes). We have noted the need for a better method in the gene expression context, where small, biologically meaningful clusters can be difficult to identify. We define a new criterion, Mean Split Silhouette (MSS), which is a measure of cluster heterogeneity. We propose to choose the number of clusters as the minimizer of MSS. In this way, the number of significant clusters is defined as that which produces the most homogeneous clusters. The power of this method compared to existing methods is demonstrated on simulated and real microarray data. The minimum MSS method is an example of a general approach that can be applied to any clustering routine with any global criterion. The key idea is to assess each cluster separately using a measure of heterogeneity.
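
The abstract does not spell out how MSS is computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each cluster is scored by the best average silhouette obtainable by splitting it into a few sub-clusters (its "split silhouette"), and that MSS is the mean of these scores over clusters. The clustering routine (a simple k-medoids with farthest-first initialisation), the Euclidean distance, the `max_split` parameter, the rule of skipping clusters with fewer than three points, and all function names are hypothetical choices made for illustration.

```python
import math

def avg_silhouette(points, labels):
    """Average silhouette width of a labelling (0.0 if fewer than two clusters)."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singletons contribute a silhouette of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def k_medoids(points, k, iters=50):
    """Simple deterministic k-medoids (assumed stand-in for any clustering routine)."""
    medoids = [0]
    while len(medoids) < k:  # farthest-first initialisation
        medoids.append(max(
            range(len(points)),
            key=lambda i: min(math.dist(points[i], points[m]) for m in medoids),
        ))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: math.dist(p, points[medoids[c]]))
            for p in points
        ]
        new = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties out
                new.append(medoids[c])
                continue
            new.append(min(
                members,
                key=lambda i: sum(math.dist(points[i], points[j]) for j in members),
            ))
        if new == medoids:
            break
        medoids = new
    return labels

def mean_split_silhouette(points, labels, max_split=3):
    """Mean over clusters of the best silhouette found by splitting each cluster."""
    split_sils = []
    for lab in sorted(set(labels)):
        members = [p for p, l in zip(points, labels) if l == lab]
        if len(members) < 3:
            continue  # assumption: clusters this small are not split
        upper = min(max_split, len(members) - 1)
        split_sils.append(max(
            avg_silhouette(members, k_medoids(members, s))
            for s in range(2, upper + 1)
        ))
    return sum(split_sils) / len(split_sils) if split_sils else 0.0

def choose_k(points, k_values):
    """Return the k that minimises MSS, together with all the scores."""
    scores = {
        k: mean_split_silhouette(points, k_medoids(points, k)) for k in k_values
    }
    return min(scores, key=scores.get), scores
```

Under these assumptions, a cluster that still contains well-separated sub-groups gets a high split silhouette, so a clustering that leaves such groups merged has a high MSS. Calling `choose_k(points, [1, 2, 3])` on a list of coordinate tuples returns the candidate number of clusters with the lowest MSS along with the score for each candidate.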


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.


For information, contact meetings@amstat.org or phone (703) 684-1221.


Revised March 2002