Abstract:
|
Clustering functional data is complicated by phase variation, which is often treated as a nuisance and removed by curve registration. Curve registration requires a target to which each functional object is aligned, but no such target is available when cluster membership is unknown. There is also a trade-off between flexible warping and separating the data into multiple clusters: the more phase variability curve registration removes, the less variability remains in the data, which often leads to fewer clusters. Consequently, the number of clusters based on amplitude variability is not uniquely identified. We propose an iterative method that performs curve registration and clustering simultaneously. We also propose a unified criterion, based on external information, for selecting the number of clusters and the warping penalty. The criterion is derived from the classification likelihood; it evaluates the association of cluster membership with an outcome variable while penalizing cluster uncertainty. We evaluate the method through simulation and apply it to digital electrocardiographic data from the CRIC study.
|