Abstract:
|
Hierarchical Bayesian modeling facilitates sharing statistical strength across datasets. Consider multiple related regression problems. The classical paradigm of Lindley & Smith (1972) models effects as exchangeable across datasets but permits effects to be related across covariates. While this relatedness can be learned when datasets outnumber covariates, many analyses involve more covariates than datasets. In these cases, we can improve performance by modeling effects as exchangeable across covariates and learning the relatedness of effects across datasets. For example, in statistical genetics, we might regress dozens of traits (defining datasets) for thousands of individuals (responses) on up to millions of genetic variants (covariates). Moreover, we might wish to learn whether certain traits are more closely related to one another than to others. We address these goals with a hierarchical model in which effects are conditionally i.i.d. across covariates, rather than across datasets. We devise an empirical Bayes estimator for this model, which determines and leverages the degrees of relatedness across datasets, has appealing theoretical and empirical properties, and outperforms existing approaches.
|