Abstract:
|
Monothetic clustering is a technique that offers advantages in interpretability and prediction over other clustering techniques. It recursively bipartitions a data set by choosing splitting rules on the variables, one variable at a time. Circular variables, whose values lie on a range in which the upper bound coincides with the lower bound (e.g., direction or time of day), must be treated differently from conventional quantitative variables. We suggest a clustering procedure for data sets that contain circular variables, from visualization, to calculation of an appropriate distance matrix, to a clock-hand-style partitioning used in monothetic clustering. The suggested method is applied to an air particle counter data set collected in Antarctica. Using measurements taken every 15 minutes, the particle counts, wind speed, and wind direction are used to create meaningful rules that partition the multivariate data set. These rules identify groups of conditions associated with different patterns of sediment transport in the Taylor Valley, Antarctica.
|