JSM 2005 - Minneapolis

Abstract #303653

This is the preliminary program for the 2005 Joint Statistical Meetings in Minneapolis, Minnesota. It currently includes the "technical" program (the schedule of invited, topic-contributed, regular contributed, and poster sessions); Continuing Education courses (August 7-10, 2005); and Committee and Business Meetings. This online program will be updated frequently to reflect the most current revisions.




The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.


In the program, each meeting room name is preceded by letters that designate the facility in which the room is located:

Minneapolis Convention Center = “MCC”
Hilton Minneapolis Hotel = “H”
Hyatt Regency Minneapolis = “HY”




Activity Number: 226
Type: Contributed
Date/Time: Tuesday, August 9, 2005 : 8:30 AM to 10:20 AM
Sponsor: Biometrics Section
Abstract - #303653
Title: Searching Solution Space for SELDI-TOF Cancer Classifiers
Author(s): Eric Siegel*+
Companies: University of Arkansas for Medical Sciences
Address: 4301 West Markham Street, Little Rock, AR, 72205, United States
Keywords: SELDI ; proteomics ; classification ; cancer ; mass spectrometry
Abstract:

To develop a classification model, we typically divide the data in half to train the model on one half and test it on the other. But when the training phase employs automated variable selection on a large number of variables, as occurs in cancer proteomics, the result is a model whose "solution" (predictors plus parameter estimates) may depend on the half to which the samples are allocated. To see how big a problem this is, I recently iterated the division, training, and testing process 10,000 times to get a distribution of solutions for classifying pancreatic cancer via logistic regression with Forward Selection on intensities of 37 peaks from the SELDI-TOF mass spectra of 103 serum samples. Although classifiers contained from three to seven peaks, every peak was selected at least once (min=12), and no peak was always selected (max=9279), implying that the peaks chosen by logistic regression with Forward Selection constitute a nonrobust and unreliable panel for classification. From each sample's Prob(Cancer|iteration), I developed two distribution-wide measures of classification performance and used them to identify a robust panel of SELDI peaks for cancer classification.
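
The repeated split-and-select procedure described in the abstract can be illustrated with a minimal sketch. This is not the author's code, and several parts are assumptions made to keep it short and dependency-free: the data are synthetic (`make_synthetic_data` is a hypothetical helper), the split is stratified by class, the panel size is fixed at `k=5`, and logistic regression with Forward Selection is replaced by a much simpler stand-in selector that ranks peaks by the absolute difference in class means. Only the overall pattern follows the abstract: split in half many times, run automated selection on the training half, and tally how often each peak is chosen.

```python
import random
from collections import Counter


def make_synthetic_data(n_samples=103, n_peaks=37, n_informative=5, seed=0):
    """Hypothetical stand-in for peak intensities; the first few peaks carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # 1 = cancer, 0 = control (alternating, illustrative only)
        row = [rng.gauss(0.8 * label if j < n_informative else 0.0, 1.0)
               for j in range(n_peaks)]
        X.append(row)
        y.append(label)
    return X, y


def select_peaks(X, y, rows, k=5):
    """Stand-in selector: keep the k peaks with the largest class-mean gap.

    This is NOT forward-selection logistic regression; it only plays the
    role of "automated variable selection on the training half".
    """
    cases = [i for i in rows if y[i] == 1]
    controls = [i for i in rows if y[i] == 0]
    scores = []
    for j in range(len(X[0])):
        mean_case = sum(X[i][j] for i in cases) / len(cases)
        mean_ctrl = sum(X[i][j] for i in controls) / len(controls)
        scores.append((abs(mean_case - mean_ctrl), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:k])


def selection_counts(X, y, n_iter=200, k=5, seed=1):
    """Repeat the half-split n_iter times and count how often each peak is selected."""
    rng = random.Random(seed)
    cases = [i for i, label in enumerate(y) if label == 1]
    controls = [i for i, label in enumerate(y) if label == 0]
    counts = Counter()
    for _ in range(n_iter):
        # Stratified split (an assumption): half of each class goes to training,
        # so both classes are always present in the training half.
        rng.shuffle(cases)
        rng.shuffle(controls)
        train = cases[: len(cases) // 2] + controls[: len(controls) // 2]
        counts.update(select_peaks(X, y, train, k))
    return counts


if __name__ == "__main__":
    X, y = make_synthetic_data()
    counts = selection_counts(X, y)
    for peak, n in sorted(counts.items(), key=lambda item: -item[1]):
        print(f"peak {peak:2d} selected in {n} of 200 splits")
```

Setting `n_iter=10_000` would mirror the number of iterations reported in the abstract. A peak whose count falls well between zero and `n_iter` is one whose selection depends on how the samples happened to be split.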


  • Address information is given for authors who have a + after their name.
  • Authors who are presenting talks have a * after their name.


JSM 2005: For information, contact jsm@amstat.org or phone (888) 231-3473. If you have questions about the Continuing Education program, please contact the Education Department.
Revised March 2005