JSM 2004 - Toronto

Abstract #302153

This is the preliminary program for the 2004 Joint Statistical Meetings in Toronto, Canada. It currently includes the technical program (the schedule of invited, topic-contributed, regular contributed, and poster sessions); Continuing Education courses (August 7-10, 2004); and committee and business meetings. This online program will be updated frequently to reflect the most current revisions.

The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.


Activity Number: 427
Type: Contributed
Date/Time: Thursday, August 12, 2004, 10:30 AM to 12:20 PM
Sponsor: General Methodology
Abstract - #302153
Title: Standardization and Denoising Algorithms for Mass Spectra to Classify Whole-organism Bacterial Specimens
Author(s): Somnath Datta*+ and Glen Satten and Hercules Moura and Adrian Woolfitt and John Barr
Companies: University of Georgia and Centers for Disease Control and Prevention and Centers for Disease Control and Prevention and Centers for Disease Control and Prevention and Centers for Disease Control and Prevention
Address: Dept. of Statistics, Athens, GA 30602
Keywords: mass spectrometry ; normalization ; pre-processing ; Random Forest ; classification
Abstract:

The application of mass spectrometry to proteomics is a breakthrough in high-throughput analysis. Early applications have focused on protein expression profiles to differentiate among various types of tissue samples (e.g., normal vs. tumor). We use mass spectra to differentiate between whole-organism samples of bacteria. The raw spectra are similar to spectra of tissue samples and raise some of the same statistical issues (e.g., nonuniform baselines and higher noise where the baseline is higher), but they are substantially noisier. As a result, new pre-processing procedures are required before these spectra can be used for statistical classification. We introduce novel pre-processing steps that can be used with any mass spectra. These include a standardization step and a denoising step. The noise level for each spectrum is determined using only data from that spectrum. After applying these pre-processing steps, we used the Random Forest program to classify 120 mass spectra into four bacterial types. The method resulted in extremely low prediction errors in the training samples and zero prediction error in a test dataset we created from the whole dataset.
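
The abstract does not spell out the standardization and denoising algorithms, so the following is only an illustrative sketch of what such steps could look like, not the authors' method. It assumes a running-median baseline, a noise level taken as the scaled median absolute deviation (MAD) of the same spectrum, and a hard threshold. The function names and the parameters half_window and threshold are hypothetical.

```python
from statistics import median


def rolling_median_baseline(intensities, half_window):
    """Assumed baseline model: running median over a centered window.

    The window is truncated at the two ends of the spectrum.
    """
    n = len(intensities)
    baseline = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        baseline.append(median(intensities[lo:hi]))
    return baseline


def noise_level(residuals):
    """Robust noise scale (1.4826 * MAD) computed from one spectrum only."""
    center = median(residuals)
    mad = median([abs(r - center) for r in residuals])
    return 1.4826 * mad


def preprocess(intensities, half_window=25, threshold=3.0):
    """Standardize a spectrum, then zero out points below the noise threshold.

    Standardization here means: subtract the baseline and divide by the
    spectrum's own noise level. Denoising here means: keep only values
    more than `threshold` noise units above the baseline.
    """
    baseline = rolling_median_baseline(intensities, half_window)
    residuals = [x - b for x, b in zip(intensities, baseline)]
    sigma = noise_level(residuals)
    if sigma == 0:
        raise ValueError("noise level is zero; cannot standardize this spectrum")
    standardized = [r / sigma for r in residuals]
    return [z if z > threshold else 0.0 for z in standardized]
```

For example, preprocess([5, 7, 4, 8, 6, 40, 5, 9, 4, 7, 6], half_window=1) keeps only the peak at index 5 and sets every other point to zero. Vectors pre-processed this way could then be passed to any Random Forest implementation for classification.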


  • The address information is for the authors who have a + after their name.
  • Authors who are presenting talks have a * after their name.


JSM 2004 For information, contact jsm@amstat.org or phone (888) 231-3473. If you have questions about the Continuing Education program, please contact the Education Department.
Revised March 2004