Activity Number: 242
Type: Contributed
Date/Time: Tuesday, August 8, 2006 : 8:30 AM to 10:20 AM
Sponsor: Section on Physical and Engineering Sciences

Abstract - #306845

Title: Analysis of Handwritten ZIP Code Digits Using OBSTree
Author(s): Atina Dunlap Brooks*+ and Jacqueline Hughes-Oliver
Companies: North Carolina State University and North Carolina State University
Address: Department of Statistics, Raleigh, NC, 27695
Keywords: data mining; machine learning; handwriting analysis; prediction; classification; simulated annealing
Abstract:
The classification method OBSTree has been used successfully to analyze drug discovery datasets. One significant feature of these datasets is that they contain a small percentage of observations with meaningful classes and a large number of uninteresting observations. This paper explores the algorithmic modifications necessary to perform an OBSTree analysis of a dataset in which there is no irrelevant class. The dataset used is a well-known US postal ZIP code dataset of individual handwritten digits. Our results are compared with those of a variety of data mining methods, and the interpretability benefits of OBSTree are examined.