Abstract:
|
Heterogeneity is an important feature of modern data analysis, and a central task is to extract information from massive and heterogeneous data. In this paper, we consider the aggregation of heterogeneous regression vectors under multiple high-dimensional linear models. We adopt the definition of the maximin effect (Meinshausen, B{\"u}hlmann, AoS, 43(4), 1801--1830) and further define the maximin effect for a targeted population by allowing for covariate shift. A ridge-type maximin effect is introduced to balance reward optimality and statistical stability. To identify the maximin effect in high dimensions, we estimate the regression covariance matrix by a debiased estimator and construct the optimal weight vector for the maximin effect. The resulting estimator of the maximin effect is not necessarily asymptotically normal since the constructed weight vector might have a mixture distribution. We devise a novel sampling approach to construct confidence intervals for any linear contrast of maximin effects in high dimensions. The coverage and precision properties of the constructed confidence intervals are studied. The proposed method is demonstrated through simulations and a genetic data set on g
|