Abstract:
|
In this work, we propose a robust and efficient transfer learning approach for high-dimensional surrogate-assisted semi-supervised learning. The approach integrates labeled observations from the source population and simultaneously leverages unlabeled observations from the target population to improve learning accuracy in the target population. Specifically, we consider a covariate shift setting and employ two nuisance models, a density ratio model and an imputation model, to combine transfer learning and surrogate-assisted semi-supervised learning strategies and achieve triple robustness. In contrast to double robustness, even if both nuisance models are misspecified, our triply robust estimator can still partially utilize the source population when the transferred source population and the target population share enough similarities, and it is theoretically guaranteed to be no worse than the target-only surrogate-assisted semi-supervised estimator, up to negligible error. We apply our method to improve the learning accuracy of polygenic risk prediction for Type II diabetes in the African American population, with the European population as the source.
|