Abstract:
|
High-dimensional data are the norm in current statistical research. Gaining systematic knowledge from these data is a cumulative process that greatly benefits from the integration of multiple studies and technologies, and that relies critically on methods of analysis. In this work we introduce ``Bayesian Multi-study Factor Analysis'' (BMFA), a generalization of Bayesian factor analysis that handles multiple studies and derives, in a single analysis, (1) common factors that capture information shared across studies and (2) study-specific factors. The fundamental challenge is to estimate the features shared among studies while identifying the variation specific to each study. We model the high-dimensional factor loading matrices, both common and study-specific, sparsely by means of shrinkage priors. We describe a computationally efficient algorithm to estimate the parameters and to select the number of relevant common and study-specific factors. We assess the operating characteristics of our method with simulation studies, and we present an application to ovarian cancer using four gene expression studies.
|