Activity Number: 476
Type: Contributed
Date/Time: Thursday, August 7, 2008 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistics and the Environment

Abstract - #301222

Title: A Model-Based Approach for Clustering Time Series of Counts
Author(s): Sarah J. Thomas*+ and Bonnie Ray and Katherine B. Ensor
Companies: Rice University and IBM T. J. Watson Research Center and Rice University
Address: Dept of Statistics MS-138, Houston, TX, 77251-1892
Keywords: count time series ; model-based clustering ; observation-driven Poisson ; zero-inflated Poisson

Abstract:
Time series of counts arise in many applications, such as modeling a customer's product purchases or tracking the abundance of biological species. It is often important to identify those series exhibiting similar behavior over time. We model each count series with an observation-driven Poisson regression model, which incorporates the autoregressive component of the series into the mean process of the Poisson. The time series are then clustered with a hierarchical agglomerative clustering algorithm using an empirical Kullback-Leibler distance. The fit of a given model to the data is assessed with a partial likelihood technique. The possibility of fitting and clustering zero-inflated models is also explored, and a goodness-of-fit test for the zero-inflated models is discussed. The algorithm is applied to modeling and clustering air pollution data for the Houston metropolitan area.
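
The pipeline described in the abstract (fit a model to each series, measure pairwise Kullback-Leibler distance, cluster hierarchically) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the authors' work: the mean specification log(mu_t) = b0 + b1 * log(y_{t-1} + 1), the symmetrized and time-averaged form of the Kullback-Leibler distance, average linkage, the simulated data, and all function names (`fit_poisson_ar`, `kl_distance`, `agglomerate`, and so on). Zero inflation is not handled.

```python
import math
import random


def rpois(mu, rng):
    """Draw one Poisson(mu) variate with Knuth's method (fine for small mu)."""
    limit, k, p = math.exp(-mu), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate(b0, b1, n, rng):
    """Simulate counts with log(mu_t) = b0 + b1 * log(y_{t-1} + 1) (assumed form)."""
    y = [0]
    for _ in range(n):
        y.append(rpois(math.exp(b0 + b1 * math.log(y[-1] + 1.0)), rng))
    return y


def fit_poisson_ar(y, iters=100):
    """Maximise the Poisson partial log-likelihood in (b0, b1) by damped Newton steps."""
    x = [math.log(v + 1.0) for v in y[:-1]]   # lagged-count covariate
    resp = y[1:]
    b0, b1 = math.log(max(sum(resp) / len(resp), 1e-8)), 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, resp):
            mu = math.exp(b0 + b1 * xi)
            g0 += yi - mu                     # score components
            g1 += (yi - mu) * xi
            h00 += mu                         # information matrix entries
            h01 += mu * xi
            h11 += mu * xi * xi
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        d0 = max(-1.0, min(1.0, d0))          # clamp steps to avoid overflow
        d1 = max(-1.0, min(1.0, d1))
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1


def poisson_kl(mu_p, mu_q):
    """KL divergence from Poisson(mu_p) to Poisson(mu_q)."""
    return mu_p * math.log(mu_p / mu_q) - mu_p + mu_q


def mean_path(y, params):
    """Conditional means implied by `params` along the observed series y."""
    b0, b1 = params
    return [math.exp(b0 + b1 * math.log(v + 1.0)) for v in y[:-1]]


def kl_distance(y_i, p_i, y_j, p_j):
    """Symmetrized, time-averaged KL between two fitted models (an assumed definition)."""
    def directed(y, p, q):
        pairs = zip(mean_path(y, p), mean_path(y, q))
        return sum(poisson_kl(a, b) for a, b in pairs) / (len(y) - 1)
    return 0.5 * (directed(y_i, p_i, p_j) + directed(y_j, p_j, p_i))


def agglomerate(dist, k):
    """Average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Illustrative data: two groups of three simulated series (made-up parameters).
rng = random.Random(0)
series = [simulate(0.5, 0.5, 300, rng) for _ in range(3)]
series += [simulate(2.0, 0.1, 300, rng) for _ in range(3)]
params = [fit_poisson_ar(y) for y in series]
m = len(series)
dist = [[0.0] * m for _ in range(m)]
for i in range(m):
    for j in range(i + 1, m):
        dist[i][j] = dist[j][i] = kl_distance(series[i], params[i], series[j], params[j])
clusters = agglomerate(dist, 2)
```

In this sketch the distance compares the conditional means that two fitted models imply along each observed series, so two series are close when either model describes both about equally well. Other linkages or a one-sided divergence would be equally defensible choices.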