Abstract: |
  Datasets with large numbers of features are becoming increasingly available and important in every field of research and innovation, urging practitioners to develop scalable algorithms with fast convergence. The question of fast convergence in the classical problem of high-dimensional linear regression has been extensively studied. Arguably, one of the fastest procedures in practice is Iterative Hard Thresholding (IHT). Still, IHT relies strongly on knowledge of the true sparsity parameter $s$. In this paper, we present a novel fast procedure for estimation in high-dimensional linear regression. Taking advantage of the interplay between estimation, support recovery, and optimization, we achieve both optimal statistical accuracy and fast convergence. The main advantage of our procedure is that it is fully adaptive, making it more practical than state-of-the-art IHT methods. Our procedure achieves optimal statistical accuracy faster than, for instance, classical algorithms for the Lasso. As a consequence, we present a new iterative hard thresholding algorithm for high-dimensional linear regression that is minimax optimal, fast, and adaptive.
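For reference, a minimal NumPy sketch of the standard, non-adaptive IHT iteration the abstract contrasts against, showing exactly where the true sparsity level $s$ enters; the function name `iht`, the step-size choice, and the iteration count are illustrative assumptions, and this is textbook IHT, not the adaptive procedure the paper proposes.

```python
import numpy as np

def iht(X, y, s, n_iter=200):
    """Standard Iterative Hard Thresholding for y ~ X @ beta with
    ||beta||_0 <= s. Note the explicit dependence on the sparsity
    level s, which the paper's adaptive procedure avoids."""
    n, p = X.shape
    # conservative step size: 1 / (largest singular value of X)^2
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        # gradient step on the least-squares objective
        beta = beta + step * X.T @ (y - X @ beta)
        # hard threshold: keep only the s largest entries in magnitude
        keep = np.argsort(np.abs(beta))[-s:]
        mask = np.zeros(p, dtype=bool)
        mask[keep] = True
        beta[~mask] = 0.0
    return beta
```

The hard-thresholding step is where prior knowledge of $s$ is required: mis-specifying it degrades both support recovery and estimation accuracy, which is the practical limitation the abstract's fully adaptive procedure is designed to remove.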