JSM 2005 - Minneapolis

Abstract #304636

This is the preliminary program for the 2005 Joint Statistical Meetings in Minneapolis, Minnesota. It currently includes the technical program (the schedule of invited, topic-contributed, regular contributed, and poster sessions); Continuing Education courses (August 7-10, 2005); and committee and business meetings. This online program will be updated frequently to reflect the most current revisions.




The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.


The Program has labeled the meeting rooms with "letters" preceding the name of the room, designating in which facility the room is located:

Minneapolis Convention Center = “MCC”
Hilton Minneapolis Hotel = “H”
Hyatt Regency Minneapolis = “HY”




Activity Number: 145
Type: Contributed
Date/Time: Monday, August 8, 2005 : 10:30 AM to 12:20 PM
Sponsor: Section on Survey Research Methods
Abstract - #304636
Title: Simulation Comparison of Variable Selection and Classification Methods
Author(s): Jun Liu*+, Shiying Wu, Robert Morris, Seungho Huh, Jiantong Wang, James Raymer, Ye Hu, and Larry Michael
Companies: RTI International (all authors)
Address: 3040 Cornwallis Rd, RTP, NC, 27709, United States
Keywords: linear classification ; nonlinear classification ; variable selection ; SVM ; GDA ; kernel-based methods
Abstract:

Partitioning subjects into homogeneous classes is a problem frequently encountered in chemometrics and bioinformatics. Six methods are compared for this purpose: Stepwise Linear Discriminant Analysis (SLDA), Canonical Discriminant Analysis (CDA), Stepwise Multinomial Logistic Regression (SMLR), Support Vector Machines (SVM), Generalized Discriminant Analysis (GDA), and Kernel Partial Least Squares (KPLS). Datasets are simulated by varying the number of classes, the distance between the classes, the number and distribution of the potential classifiers, the level of correlation among the potential classifiers, and the inherent nonlinearity. The results suggest that when the variables are normally distributed and the classes are linearly separable, SLDA and SVM are the best performers; when the data are nonlinear and the variables are not normally distributed, SVM is the best performer; when the number of variables is large compared to the sample size, linear methods can achieve a satisfactory rate of correct classification in most cases; and when the curvature of the separating surfaces increases, linear methods are less satisfactory.
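
The kind of simulation comparison described in the abstract can be illustrated with a small sketch. It is not the authors' design or code. It assumes two classes in two dimensions, and it swaps in two simple stand-in classifiers: a nearest-centroid rule (a linear boundary) and a nearest-class-mean rule in RBF kernel feature space (a basic kernel-based method), rather than SLDA, SVM, GDA, or KPLS. All function names, the ring-shaped nonlinear layout, and parameter values such as `gamma=0.5` are hypothetical choices made for illustration.

```python
import math
import random


def simulate(n_per_class, distance, nonlinear, rng):
    """Simulate two classes in 2-D (an assumed, illustrative design).

    Linear case: unit-variance Gaussian clouds with means `distance` apart.
    Nonlinear case: class 0 is a tight cloud at the origin and class 1
    lies on a noisy ring of radius `distance` around it.
    """
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            if nonlinear and label == 1:
                angle = rng.uniform(0, 2 * math.pi)
                r = distance + rng.gauss(0, 0.3)
                point = (r * math.cos(angle), r * math.sin(angle))
            elif nonlinear:
                point = (rng.gauss(0, 0.5), rng.gauss(0, 0.5))
            else:
                point = (rng.gauss(label * distance, 1.0), rng.gauss(0, 1.0))
            X.append(point)
            y.append(label)
    return X, y


def fit_linear(X, y):
    """Nearest-centroid rule: a linear boundary between the class means."""
    centroids = {}
    for label in set(y):
        pts = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = tuple(sum(c) / len(pts) for c in zip(*pts))

    def predict(x):
        return min(centroids, key=lambda lab: math.dist(x, centroids[lab]))

    return predict


def rbf(a, b, gamma):
    """Gaussian (RBF) kernel between two points."""
    return math.exp(-gamma * math.dist(a, b) ** 2)


def fit_kernel(X, y, gamma=0.5):
    """Nearest class mean in RBF feature space (stand-in kernel method)."""
    groups = {lab: [x for x, l in zip(X, y) if l == lab] for lab in set(y)}
    # Mean pairwise kernel value within each class; constant per class.
    within = {
        lab: sum(rbf(a, b, gamma) for a in pts for b in pts) / len(pts) ** 2
        for lab, pts in groups.items()
    }

    def predict(x):
        # Squared feature-space distance to each class mean, omitting
        # k(x, x), which is identical for every class.
        def score(lab):
            pts = groups[lab]
            return within[lab] - 2 * sum(rbf(x, p, gamma) for p in pts) / len(pts)

        return min(groups, key=score)

    return predict


def accuracy(predict, X, y):
    """Fraction of points whose predicted label matches the true label."""
    return sum(predict(x) == lab for x, lab in zip(X, y)) / len(y)


def run_comparison(nonlinear, distance=3.0, seed=0):
    """Train both classifiers on one simulated set, score on a fresh one."""
    rng = random.Random(seed)
    X_train, y_train = simulate(100, distance, nonlinear, rng)
    X_test, y_test = simulate(200, distance, nonlinear, rng)
    return {
        "linear": accuracy(fit_linear(X_train, y_train), X_test, y_test),
        "kernel": accuracy(fit_kernel(X_train, y_train), X_test, y_test),
    }
```

Calling `run_comparison(nonlinear=False)` and `run_comparison(nonlinear=True)` returns the test accuracy of each stand-in classifier under the two layouts. Under these assumptions the linear rule should do well on the separated clouds and poorly on the ring, which mirrors the abstract's qualitative point about curved class boundaries. A fuller study would also vary the number of classes, the correlation among variables, and the variable distributions.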


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.


JSM 2005: For information, contact jsm@amstat.org or phone (888) 231-3473. If you have questions about the Continuing Education program, please contact the Education Department.
Revised March 2005