Abstract:
Traditional cluster analysis methods used for ordinal data, e.g. k-means, are mostly heuristic and lack statistical inference tools for comparing competing models. To address this, we have developed clustering models based on finite mixtures and applied them to repeated ordinal data within a Bayesian setting. In particular, we present a hierarchical model with data at three levels: clusters, individuals and occasions, of which only the last two are observed. That is, we assume that individuals come from a finite mixture of latent clusters. To model the ordinal nature of the data, we use cumulative logit models that include cluster-specific time effects to account for the correlation between repeated occasions within the same individual. To illustrate the model, we apply it to 2001-2010 self-reported health status (SRHS) data from the Household, Income and Labour Dynamics in Australia (HILDA) survey. SRHS is an ordinal variable with five categories (poor, fair, good, very good and excellent) and is highly correlated within individuals. The data and resulting clusters are visualized using heatmaps.
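
As a rough illustration of the model structure described above, the following Python sketch computes category probabilities under a cumulative logit model and the likelihood of one individual's sequence, marginalised over latent clusters. The abstract does not give the parameterisation, so several details are assumptions made for illustration only: cutpoints shared across clusters, the sign convention logit P(Y <= k) = cutpoint_k - eta, one time effect per cluster and occasion, and conditional independence of occasions given the cluster. The function names and all numeric values are hypothetical.

```python
import math

def category_probs(cutpoints, eta):
    """P(Y = k), k = 1..K, assuming logit P(Y <= k) = cutpoints[k-1] - eta."""
    # Cumulative probabilities at each cutpoint; the last category closes at 1.
    cdf = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    # Category probabilities are differences of adjacent cumulative values.
    return [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, len(cdf))]

def mixture_loglik(y, cutpoints, time_effects, weights):
    """Log-likelihood of one individual's ordinal sequence y (coded 1..K).

    time_effects[g][t] is the (assumed) effect of occasion t in cluster g;
    weights[g] is the mixing proportion of cluster g.
    """
    total = 0.0
    for pi_g, beta_g in zip(weights, time_effects):
        lik = pi_g
        # Assumption: occasions are independent given cluster membership.
        for t, y_t in enumerate(y):
            lik *= category_probs(cutpoints, beta_g[t])[y_t - 1]
        total += lik
    return math.log(total)

# Hypothetical values: 5 categories (4 cutpoints), 2 clusters, 3 occasions.
cutpoints = [-2.0, -0.5, 0.5, 2.0]
time_effects = [[1.0, 1.2, 1.5], [-1.0, -0.8, -0.5]]
weights = [0.6, 0.4]
probs = category_probs(cutpoints, 0.0)
loglik = mixture_loglik([4, 4, 5], cutpoints, time_effects, weights)
```

In a Bayesian fit, priors would be placed on the cutpoints, time effects and mixing proportions, and a likelihood of this general form would be evaluated for every individual; the sketch covers only the likelihood for a single individual.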
ASA Meetings Department
732 North Washington Street, Alexandria, VA 22314
(703) 684-1221 • meetings@amstat.org
Copyright © American Statistical Association.