| Activity Number: | 415 |
| Type: | Contributed |
| Date/Time: | Wednesday, August 9, 2006 : 10:30 AM to 12:20 PM |
| Sponsor: | Section on Survey Research Methods |
| Abstract - #306403 |
| Title: | Cluster Analysis for Outlier Detection and Its Application in a Large-Scale Survey |
| Author(s): | Jianqiang Wang*+ and Jean D. Opsomer |
| Companies: | Iowa State University and Iowa State University |
| Address: | Department of Statistics, Ames, IA, 50011 |
| Keywords: | hierarchical agglomerative clustering; outlier detection; distance measures; survey data collection |
| Abstract: | Cluster analysis is a popular data mining tool that helps researchers explore the structure of multidimensional data, find special groups in populations, and seek associations between individual units. It can also be applied to detect unusual points in data. The National Resources Inventory (NRI) is a longitudinal survey of natural resources information on nonfederal land in the United States. One problem encountered during NRI data collection and processing is the presence of unusual observations and outliers. These observations need to be identified and evaluated for correctness to ensure the quality of the NRI data. An exploratory study is conducted to investigate the use of clustering approaches for outlier detection in the NRI. The performance of different hierarchical clustering methods is compared in terms of their ability to isolate artificially constructed outliers. |
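
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the synthetic data, the rule of flagging members of very small clusters, the helper name `flag_small_clusters`, and the parameters `n_clusters` and `max_size` are all assumptions made for illustration. The sketch plants three artificial outliers in simulated data and checks how many of them each SciPy linkage method isolates.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def flag_small_clusters(X, method="single", n_clusters=5, max_size=3):
    """Flag points that land in clusters with at most `max_size` members.

    Hypothetical outlier rule for illustration only; the abstract does
    not specify how outliers are read off the cluster tree.
    """
    Z = linkage(X, method=method)  # Euclidean distance by default
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    sizes = np.bincount(labels)    # sizes[k] = number of points in cluster k
    return sizes[labels] <= max_size


# Synthetic stand-in for survey records: a bulk of ordinary points
# plus three artificially constructed outliers far from the bulk.
rng = np.random.default_rng(0)
bulk = rng.normal(0.0, 1.0, size=(200, 2))
planted = np.array([[12.0, 12.0], [-11.0, 13.0], [14.0, -12.0]])
X = np.vstack([bulk, planted])
is_planted = np.arange(len(X)) >= len(bulk)

# Compare linkage methods by planted outliers found and false alarms raised.
results = {}
for method in ("single", "complete", "average", "ward"):
    flagged = flag_small_clusters(X, method=method)
    found = int((flagged & is_planted).sum())
    false_alarms = int((flagged & ~is_planted).sum())
    results[method] = (found, false_alarms)
    print(f"{method:>8}: planted outliers isolated = {found}/3, "
          f"other points flagged = {false_alarms}")
```

Both the number of clusters at which the tree is cut and the size threshold affect which points are flagged, so a real comparison would vary them along with the linkage method and the distance measure.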