Abstract:
|
Conventional unsupervised clustering operates on essentially unstructured feature vectors in feature spaces of fixed dimension. We describe a family of clustering models that group time series of arbitrary, possibly unequal, lengths. In place of a weighted Euclidean distance metric, the distance is the one induced by a Kalman filter that describes the behavior of a time series. Equivalently, the clustering scheme can be described by a Bayesian network in which a discrete mode variable controls a Kalman filter, so that each value of the mode variable selects a different Kalman filter and hence a different pattern of time series evolution. The learning algorithm is a version of the EM algorithm in which the E step is performed by a pass of the Kalman smoother, and the M step uses the resulting expectations to update the filter parameters. We present an application to clustering time series observations from the Southern California Integrated GPS Network (SCIGN). The SCIGN data comprise several years of daily position estimates from over 200 sensors, at a precision of a few millimeters.
|
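As a rough illustration of a distance induced by a Kalman filter, the sketch below scores a series by its negative log-likelihood under a scalar state-space model and assigns it to the best-scoring mode. Everything here is an assumption made for illustration and is not taken from the paper: the scalar model, the function names `kalman_nll` and `assign_mode`, the parameters `a`, `q`, `r`, and the two example modes. It covers only a hard assignment step with fixed filter parameters; the EM updates with the Kalman smoother are not shown.

```python
import math

def kalman_nll(series, a, q, r, x0=0.0, p0=1.0):
    """Negative log-likelihood of a scalar series under the assumed model
    x[t] = a * x[t-1] + w,  w ~ N(0, q)
    y[t] = x[t] + v,        v ~ N(0, r)
    computed from the Kalman filter's one-step prediction errors."""
    x, p = x0, p0
    nll = 0.0
    for y in series:
        # Predict the next state and its variance.
        x_pred = a * x
        p_pred = a * a * p + q
        # Innovation (prediction error) and its variance.
        innov = y - x_pred
        s = p_pred + r
        nll += 0.5 * (math.log(2.0 * math.pi * s) + innov * innov / s)
        # Update the state estimate with the new observation.
        k = p_pred / s
        x = x_pred + k * innov
        p = (1.0 - k) * p_pred
    return nll

def assign_mode(series, modes):
    """Index of the mode (filter parameter set) with the lowest score."""
    scores = [kalman_nll(series, **m) for m in modes]
    return min(range(len(modes)), key=scores.__getitem__)

# Two hypothetical modes: a slowly varying one and an oscillating one.
modes = [
    {"a": 0.95, "q": 0.1, "r": 0.1},
    {"a": -0.8, "q": 0.1, "r": 0.1},
]

# Series of unequal length are scored with the same function.
steady = [1.0] * 20
alternating = [1.0, -1.0] * 6
labels = [assign_mode(steady, modes), assign_mode(alternating, modes)]
```

Because the score is accumulated one observation at a time, series of different lengths need no padding or resampling. A full treatment would replace the hard assignment with posterior mode probabilities and re-estimate each filter's parameters from the data.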