
Abstract Details

Activity Number: 294 - Epidemiologic Methods for the Re-Use of Existing Data
Type: Topic Contributed
Date/Time: Tuesday, July 31, 2018 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistics in Epidemiology
Abstract #328826
Title: Data Integration for the Simultaneous Estimation of Normal Means
Author(s): Sihai Dave Zhao*
Companies: University of Illinois at Urbana-Champaign
Keywords: Compound decision problem; Data integration; Integrative genomics; Nonparametric empirical Bayes
Abstract:

The integrative analysis of disparate datasets is an important strategy in data analysis. It is increasingly popular in genomics, which enjoys a wealth of publicly available datasets that can be compared and contrasted, or combined with new data, to extract novel scientific insights. This paper studies a simple but non-trivial example of data integration: leveraging an auxiliary sequence of side information for the simultaneous estimation of a vector of normal means. The task is formulated as a compound decision problem, an oracle integrative decision rule is derived, and a data-driven estimate of this rule is proposed, obtained by minimizing Stein's unbiased risk estimate (SURE) of the oracle risk. The data-driven rule is shown to asymptotically achieve the minimum possible risk among all separable decision rules, and its good performance is demonstrated in numerical experiments. The proposed method leads naturally to an integrative high-dimensional classification procedure, which is shown to be capable of outperforming non-integrative methods in genomics problems.
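
As a rough illustration of what choosing a decision rule by minimizing an unbiased risk estimate can look like, the sketch below uses a deliberately simplified setting. It is not the nonparametric rule described in the abstract. The assumptions are: observations x_i ~ N(theta_i, sigma2) with known variance, side information y_i treated as fixed (or independent of x), and a one-parameter linear family delta_i(b) = x_i - b (x_i - y_i) that shrinks each observation toward its side information. All function names are hypothetical.

```python
import random


def sum_sq_diff(x, y):
    """Sum of squared differences between observations and side information."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def sure_linear_shrinkage(x, y, b, sigma2=1.0):
    """Stein's unbiased risk estimate for delta_i = x_i - b * (x_i - y_i).

    Writing delta = x + g(x) with g_i = -b * (x_i - y_i), Stein's identity
    gives SURE = n*sigma2 + ||g||^2 + 2*sigma2*sum(dg_i/dx_i), and here
    dg_i/dx_i = -b for every i.
    """
    n = len(x)
    return n * sigma2 + b * b * sum_sq_diff(x, y) - 2.0 * sigma2 * b * n


def sure_optimal_b(x, y, sigma2=1.0):
    """Shrinkage weight in [0, 1] that minimizes the SURE criterion above."""
    n = len(x)
    ss = sum_sq_diff(x, y)
    if ss == 0.0:
        # SURE is then decreasing in b, so the minimizer on [0, 1] is b = 1.
        return 1.0
    return min(1.0, n * sigma2 / ss)


def shrink(x, y, b):
    """Apply the linear rule: move each x_i a fraction b of the way toward y_i."""
    return [xi - b * (xi - yi) for xi, yi in zip(x, y)]


if __name__ == "__main__":
    # Simulated data (assumption): side information is a noisy copy of the means.
    rng = random.Random(0)
    theta = [rng.gauss(0.0, 2.0) for _ in range(500)]
    y = [t + rng.gauss(0.0, 0.5) for t in theta]
    x = [t + rng.gauss(0.0, 1.0) for t in theta]

    b_hat = sure_optimal_b(x, y)
    est = shrink(x, y, b_hat)
    loss_raw = sum((xi - t) ** 2 for xi, t in zip(x, theta))
    loss_est = sum((e - t) ** 2 for e, t in zip(est, theta))
    print("chosen b:", b_hat)
    print("squared-error loss without / with side information:", loss_raw, loss_est)
```

In this toy family the SURE criterion is quadratic in b, so the minimizer is available in closed form; a nonparametric rule of the kind described in the abstract would instead require optimizing over a much richer class of separable rules.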


Authors who are presenting talks have a * after their name.
