In this talk we revisit the seminal work of Zou, Hastie and Tibshirani, who formulated Sparse PCA as a regularized regression-type problem. While their algorithm is popular, it does not scale to big data. To ease the computational burden, we use the variable projection method developed by Golub and Pereyra in the 1970s. Following this powerful paradigm for minimizing over two sets of variables, we recast the original problem as a value-function optimization problem. This new formulation allows us to a) incorporate a wide variety of sparsity-promoting regularizers such as the LASSO, the ElasticNet, or the ZeroNet, and b) utilize robust formulations such as the Huber loss function. Next, we show how randomized methods from linear algebra can be leveraged to extend the approach to the large-scale setting. The proposed algorithms are applied to both synthetic and real-world data, demonstrating exceptional computational efficiency and diagnostic performance.
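The alternating structure described above can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it assumes the regression-type objective ||X - X B A^T||_F^2 + alpha ||B||_1 over an orthonormal matrix A and sparse loadings B, updates A in closed form via an orthogonal Procrustes step, and updates B by a proximal gradient step with soft-thresholding (the LASSO prox). The function name and parameters are hypothetical.

```python
import numpy as np

def spca_varpro(X, k, alpha=1e-3, n_iter=200, seed=0):
    """Sketch of sparse PCA via alternating minimization of
    ||X - X B A^T||_F^2 + alpha * ||B||_1 over orthonormal A
    and sparse loadings B (both of shape n_features x k)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    G = X.T @ X                          # Gram matrix, reused each iteration
    B = 0.01 * rng.standard_normal((p, k))
    A = np.linalg.qr(rng.standard_normal((p, k)))[0]
    step = 1.0 / np.linalg.norm(G, 2)    # step size from spectral norm of G
    for _ in range(n_iter):
        # A-update: orthogonal Procrustes problem, solved by a small SVD
        U, _, Vt = np.linalg.svd(G @ B, full_matrices=False)
        A = U @ Vt
        # B-update: gradient of the smooth part, then soft-thresholding
        B = B - step * (G @ B - G @ A)
        B = np.sign(B) * np.maximum(np.abs(B) - step * alpha, 0.0)
    return B, A
```

For the large-scale setting mentioned above, the exact Gram-matrix products would be replaced by randomized sketches of X; the alternating structure stays the same.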