Abstract:
|
Two-way structured data arise naturally in scientific applications such as microbiome, metabolomic and neuroimaging studies. These data often exhibit two characteristics: (i) extrinsic structure among the variables, such as phylogeny, pathway or network connectivity; and (ii) similarities among observations/samples that are informed by non-Euclidean measures, such as presence/absence.
In a recent paper, Allen et al. (2014, JASA) extended the approach of Escoufier (1977) and proposed the Generalized Matrix Decomposition (GMD) as a natural alternative to classical dimension reduction methods, such as principal component analysis (PCA), which do not account for the row and column structure of the data matrix. We describe extensions of this work to graphical exploration and regression modeling. We propose (i) the GMD biplot, an effective tool for exploratory analysis of two-way structured data; (ii) GMD regression (GMDR), an efficient estimation method for high-dimensional regression with two-way structured data; and (iii) a high-dimensional inference framework for these models.
|