JSM 2004 - Toronto

Abstract #301013

This is the preliminary program for the 2004 Joint Statistical Meetings in Toronto, Canada. It currently includes the "technical" program (the schedule of invited, topic-contributed, regular contributed, and poster sessions); Continuing Education courses (August 7-10, 2004); and Committee and Business Meetings. This online program will be updated frequently to reflect the most current revisions.


The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.





Activity Number: 59
Type: Invited
Date/Time: Sunday, August 8, 2004 : 6:00 PM to 7:50 PM
Sponsor: Section on Statistical Computing
Abstract - #301013
Title: Boosting Extensions to Find Anomaly Structure in Data
Author(s): Virginia L. Wheway*+
Companies: Boeing Company
Address: 2100 Lake Washington Blvd., Renton, WA 98056
Keywords: data-mining; noisy; boosting; machine learning; classification; cluster
Abstract:

Recent advances in data mining have led to the development of a method called "boosting." Instead of building a single model, boosting builds several models on reweighted versions of the original data and then combines them into a single prediction model via voting. Studies have demonstrated that boosting leads to significantly lower prediction error on unseen data. This poster demonstrates how boosting may be extended beyond its original aim of improved prediction. Simple plots of specific boosting statistics may be used as tools to detect noisy data and unearth structure within datasets, and to determine whether this "suspect" data occurs in groups or as single observations. An industrial dataset will be used to demonstrate the process and to show the method's power to detect unknown clusters within datasets. Proposed extensions of this research include testing for the threshold cluster size that can be detected. Extending the method to continuous data would open many opportunities for existing datasets in many sectors, particularly for data containing rare events and data usually considered to be noisy.
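
The abstract does not say which boosting statistics are plotted, so the following is only an illustrative sketch and not the author's method. It assumes an AdaBoost-style scheme with one-dimensional decision stumps and uses each observation's mean weight across rounds as a hypothetical "boosting statistic"; the function names and the toy data (one deliberately flipped label) are invented for illustration.

```python
import math

def stump_predict(x, threshold, polarity):
    # A stump votes `polarity` at or above the threshold, the opposite below.
    return polarity if x >= threshold else -polarity

def fit_stump(xs, ys, weights):
    # Exhaustively pick the stump with the smallest weighted error.
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[2]:
                best = (threshold, polarity, err)
    return best

def boost_with_weight_trace(xs, ys, rounds):
    # AdaBoost-style loop that also records each observation's mean weight.
    n = len(xs)
    weights = [1.0 / n] * n
    models, weight_sums = [], [0.0] * n
    for _ in range(rounds):
        threshold, polarity, err = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # this model's vote strength
        models.append((threshold, polarity, alpha))
        # Up-weight misclassified observations, down-weight the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
        weight_sums = [s + w for s, w in zip(weight_sums, weights)]
    return models, [s / rounds for s in weight_sums]

def predict(models, x):
    # Weighted vote over all stumps.
    score = sum(alpha * stump_predict(x, t, p) for t, p, alpha in models)
    return 1 if score >= 0 else -1

# Toy data: label is -1 below x = 5 and +1 otherwise, except a flipped label at x = 2.
xs = list(range(10))
ys = [-1, -1, 1, -1, -1, 1, 1, 1, 1, 1]
models, mean_weight = boost_with_weight_trace(xs, ys, rounds=3)
suspect = max(range(len(xs)), key=lambda i: mean_weight[i])
print("index with the largest mean weight:", suspect)
```

Observations that the successive models keep misclassifying accumulate weight, so plotting `mean_weight` against the observation index is one simple way such a statistic might single out suspect points. Whether it separates isolated points from groups depends on the data and on the statistic chosen.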


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.


JSM 2004 For information, contact jsm@amstat.org or phone (888) 231-3473. If you have questions about the Continuing Education program, please contact the Education Department.
Revised March 2004