Modern large-scale statistical datasets may involve millions of observations and features. An efficient learning scheme is proposed that gradually removes variables according to a criterion and a schedule. The resulting algorithms build variable screening into estimation, and because the problem size keeps shrinking across iterations, the scheme is particularly suitable for big-data learning. Theoretical guarantees of low statistical error are provided even in the presence of design coherence. Experiments on real and synthetic data show that the proposed method compares favorably with many state-of-the-art methods in regression and classification while remaining computationally scalable.