Activity Number: 34 - Advanced Methods in Statistical Learning
Type: Contributed
Date/Time: Sunday, August 7, 2022 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #322662
Title: Data Integration via Analysis of Subspaces (DIVAS)
Author(s): Jack B. Prothero* and Jan Hannig and J. S. (Steve) Marron and Quoc Tran-Dinh and Meilei Jiang
Companies: National Institute of Standards and Technology and University of North Carolina at Chapel Hill and UNC and University of North Carolina Chapel Hill and Meta
Keywords: Data Integration; Principal Angles; Multi-Block; Partially-Shared
Abstract:
Modern data collection in bioinformatics and other big-data paradigms often incorporates traits derived from multiple different points of view of the observations. We call such data multi-view or multi-block data. The emergent field of data integration develops and applies new methods for studying multi-block data and identifying how different data blocks relate and differ. One major frontier in contemporary data integration research is methodology that can identify partially shared structure between sub-collections of data blocks. This work presents our new method on this frontier: Data Integration Via Analysis of Subspaces (DIVAS). DIVAS combines new insights in angular subspace perturbation theory with recent developments in matrix signal processing and convex-concave optimization into one algorithm for parsing partially shared structure. Our novel approach, based on principal angles between subspaces, provides built-in inference on the results of the analysis and is effective even in high-dimension-low-sample-size (HDLSS) situations.
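
The abstract builds on principal angles between subspaces. As a rough illustration of that single ingredient only, the sketch below computes principal angles with the standard textbook recipe: orthonormalize each basis, then take the singular values of the cross-product of the two orthonormal bases. This is not the DIVAS algorithm; the function name `principal_angles` and the toy matrices `A` and `B` are assumptions made for this example, and the code assumes both inputs have full column rank.

```python
import numpy as np


def principal_angles(A, B):
    """Principal angles (radians, ascending) between the column spaces of A and B.

    Illustrative helper, not part of DIVAS. Assumes A and B share the same
    number of rows and each has full column rank.
    """
    # Orthonormal bases for the two column spaces (reduced QR).
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip to guard against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))


# Toy example in R^3: two planes that share exactly one direction (the first axis).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])  # span{e1, e2}
B = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [0.0, 1.0]])  # span{e1, e3}

angles = principal_angles(A, B)
print(np.degrees(angles))
```

In this toy case the smallest angle is zero, reflecting the direction the two planes share, and the other is a right angle. With noisy data, deciding which small angles indicate genuinely shared structure requires the kind of perturbation theory the abstract refers to, which this sketch does not attempt.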
Authors who are presenting talks have a * after their name.