Abstract:
|
We consider the situation where multivariate functional data have been collected over time at each of a set of sites. Our illustrative setting is bivariate: ozone and PM10 levels are monitored as functions of time over the course of a year at a set of monitoring sites. Our objective is to implement model-based clustering of the functions across the sites. In our example, such clustering can be considered for ozone and PM10 individually or jointly, and it may differ between the two pollutants. More importantly for us, we allow such clustering to vary with time.
We model each function at each site using Gaussian processes. We use dimension reduction to provide a stochastic process specification for the distribution of the collection of multivariate functions over the, say, n sites. We use the Dirichlet process to capture clustering, i.e., shared labeling of the functions across the sites. We partition the time scale to capture time-varying clustering. Though the functions arise in continuous time, clustering in continuous time is not of practical interest and, in addition, is extremely computationally demanding.
|