Abstract:
|
Generalized Correlation Analysis (GCA) is a technique for finding highly correlated variation patterns across multiple datasets. It includes PCA and CCA as special cases. In this presentation, we discuss the estimation of leading sparse generalized correlation loading vectors in high dimensions, a problem we refer to as sparse GCA. Sparse GCA has found applications in multi-omics and multi-modal imaging, where the ambient dimensions of the datasets are high. We first postulate a latent variable model to clarify what the target of estimation is. We then propose an efficient algorithm based on thresholded gradient descent on a non-convex objective function. With proper initialization, the algorithm is shown to achieve optimal estimation error rates. Time permitting, we will also discuss the numerical performance of the algorithm on simulated and real datasets.
|