
All Times EDT

Abstract Details

Activity Number: 139 - Recent Advances of Semi-Supervised Learning: Techniques and Applications
Type: Invited
Date/Time: Tuesday, August 10, 2021 : 10:00 AM to 11:50 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #316722
Title: Optimal Semi-Supervised Estimation and Inference for High-Dimensional Linear Regression
Author(s): Yang Ning* and Jiwei Zhao and Heping Zhang
Companies: Cornell University and University of Wisconsin-Madison and Yale University
Keywords: Semi-supervised learning; Minimax optimality; High-dimensional inference; Sparsity; Model misspecification
Abstract:

There are many scenarios, such as electronic health records, in which the outcome is much more difficult to collect than the covariates. Data with observed outcomes are called labeled, and data without outcomes are called unlabeled. In this paper, we consider the linear regression problem with this data structure in the high-dimensional setting. Clearly, any supervised estimator can use only the labeled data. Our goal is to investigate when and how the unlabeled data can be exploited to improve estimation and inference for the regression parameters in linear models. In particular, we address the following two questions. (1) Can we use the labeled and unlabeled data together to construct a semi-supervised estimator whose convergence rate is faster than that of supervised estimators (e.g., the lasso and the Dantzig selector)? (2) Can we construct confidence intervals or hypothesis tests that are guaranteed to be more efficient or more powerful than those based on supervised estimators?


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program