Abstract:
|
We propose a regularization method called direction-penalized principal component analysis (dPCA). This approach penalizes the first principal component, i.e., the direction of maximum variance of the data, for deviations from a target direction. While the target vector has a natural interpretation as a Bayesian prior, our main contributions lie elsewhere. In particular, we derive an optimal penalty parameter that, for any target, always reduces the asymptotic L2 loss relative to that of the raw principal component. The optimal penalty parameter is determined solely from the data, and an iterative algorithm efficiently computes the dPCA estimator. We prove our results in a high-dimension, low-sample-size framework that is increasingly relevant for modern applications. To shed light on the dPCA estimator, we develop connections to Ledoit-Wolf constant-correlation shrinkage as well as to a recently proposed James-Stein estimator for the first principal component. We demonstrate the performance of dPCA by benchmarking against both of these estimators.
|