Abstract:
|
Tensor data in the form of multi-dimensional arrays arise in modern scientific studies such as computational biology, brain imaging analysis, and process monitoring systems. These data are intrinsically heterogeneous, with complex dependencies and structures, yet they are commonly analyzed by two-stage approaches that first apply ad hoc dimension reduction methods; such approaches lack statistical efficiency and can obscure important findings. Model-based clustering is a cornerstone of multivariate statistics and unsupervised learning, but existing methods and algorithms are not designed for tensor-variate samples. In this article, we propose a novel Tensor Envelope Mixture Model (TEMM) for simultaneous clustering and multiway dimension reduction of tensor data. The TEMM incorporates tensor-structure-preserving dimension reduction into the classical Gaussian mixture model and drastically reduces the number of free parameters and the estimation variability. An EM-type algorithm is developed to obtain likelihood-based estimators of the cluster means and covariances, which are jointly parameterized and constrained to a series of lower-dimensional subspaces known as tensor envelopes.
|