Abstract:
We consider the problem of using high-dimensional data residing on a graph, defined by its vertex set G and edge set E, to predict a low-dimensional outcome variable, such as disease status. Many such data share two key features: spatial smoothness and an intrinsically low-dimensional structure. We propose a simple solution based on a general statistical framework called multiscale weighted principal component regression (MWPCR). MWPCR introduces two sets of weights: importance score weights, which select individual features at each node of G, and spatial weights, which incorporate the neighboring pattern of E on the graph. We integrate the importance score weights with the spatial weights to recover the low-dimensional structure of the high-dimensional data on the graph. To better understand MWPCR, we systematically investigate its theoretical properties in a high-dimensional binary classification setting. We demonstrate the utility of our methods through extensive simulations and a real data analysis based on data from the Alzheimer's Disease Neuroimaging Initiative (ADNI).
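The weighting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' MWPCR procedure. The abstract does not specify how the weights are built, so the t-statistic importance score, the neighbor-averaging spatial weights, the final least-squares classifier, the function name `mwpcr_sketch`, and the synthetic chain-graph data are all assumptions made for illustration.

```python
import numpy as np


def mwpcr_sketch(X, y, adjacency, n_components=2):
    """Illustrative weighted principal component regression on graph data.

    X: (n_samples, n_nodes) array, one feature per graph node.
    y: (n_samples,) binary labels in {0, 1}.
    adjacency: (n_nodes, n_nodes) symmetric 0/1 matrix encoding the edge set E.
    All weighting choices below are assumptions, not the published method.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    A = np.asarray(adjacency, dtype=float)

    # Assumed importance score: absolute two-sample t-like statistic per node.
    x0, x1 = X[y == 0], X[y == 1]
    se = np.sqrt(x0.var(axis=0, ddof=1) / len(x0)
                 + x1.var(axis=0, ddof=1) / len(x1))
    importance = np.abs(x1.mean(axis=0) - x0.mean(axis=0)) / (se + 1e-12)
    importance = importance / max(importance.max(), 1e-12)

    # Assumed spatial weights: each node averages itself and its neighbors.
    S = A + np.eye(A.shape[0])
    S = S / S.sum(axis=1, keepdims=True)

    # Combine both weights: smooth over the graph, then scale by importance.
    Xc = X - X.mean(axis=0)
    Z = (Xc @ S.T) * importance

    # Principal components of the weighted data via SVD.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T

    # Least-squares regression of the labels on the component scores,
    # thresholded at 0.5 to give a class prediction.
    design = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(design, y.astype(float), rcond=None)
    pred = (design @ coef > 0.5).astype(int)
    return scores, coef, pred


# Synthetic example: a chain graph with a spatially contiguous signal.
rng = np.random.default_rng(0)
n, p = 100, 30
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[y == 1, 10:15] += 1.5  # class difference confined to adjacent nodes

adjacency = np.zeros((p, p))
idx = np.arange(p - 1)
adjacency[idx, idx + 1] = 1
adjacency[idx + 1, idx] = 1

scores, coef, pred = mwpcr_sketch(X, y, adjacency)
accuracy = (pred == y).mean()
print(f"in-sample accuracy: {accuracy:.2f}")
```

The sketch smooths before scaling so that a spatially contiguous signal is reinforced by its neighbors while isolated noisy nodes are down-weighted. The accuracy printed here is in-sample only and says nothing about the performance reported for MWPCR.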
Copyright © American Statistical Association.