Abstract:
|
Nodewise Lasso regression is conventionally used to estimate a sparse precision (inverse covariance) matrix in a Gaussian graphical model by running a cross-validated Lasso regression of each variable on all of the remaining variables. However, when the number of samples and variables is large, this approach can be extremely expensive computationally, especially when the Lasso penalty parameter is selected via k-fold cross-validation. We discover a strong relationship between the maximum absolute correlation of a response variable with its predictors and the penalty parameter selected by cross-validation. We demonstrate this relationship in a variety of simulation studies as well as on real data examples. We harness this relationship to develop an approximation algorithm for selecting the cross-validated penalty parameter in each regression fit, yielding speedups on the order of 15-30 times. Finally, we investigate an alternative application of nodewise Lasso regression to a matrix estimation problem and demonstrate the validity of selecting the model with the lowest cross-validation error.
|
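For concreteness, the following is a minimal Python sketch (not the paper's implementation) of the two procedures summarized in the abstract: the conventional nodewise Lasso with k-fold cross-validated penalty selection, and an approximate variant in which the cross-validation grid search is replaced by a caller-supplied mapping from the maximum absolute correlation to a penalty value. The function name penalty_from_correlation is hypothetical, standing in for whatever functional form the paper fits to that relationship.

    import numpy as np
    from sklearn.linear_model import Lasso, LassoCV

    def nodewise_lasso_cv(X, cv=5):
        # Conventional approach: for each variable j, run a k-fold
        # cross-validated Lasso of column j on all remaining columns.
        n, p = X.shape
        coefs = np.zeros((p, p))
        cv_penalties = np.zeros(p)
        for j in range(p):
            y = X[:, j]
            Z = np.delete(X, j, axis=1)
            fit = LassoCV(cv=cv).fit(Z, y)
            cv_penalties[j] = fit.alpha_
            coefs[j, np.arange(p) != j] = fit.coef_
        return coefs, cv_penalties

    def nodewise_lasso_approx(X, penalty_from_correlation):
        # Approximate approach (illustrative assumption): skip the CV grid
        # search and set the penalty from the maximum absolute correlation
        # of the response with its predictors via a caller-supplied mapping.
        n, p = X.shape
        coefs = np.zeros((p, p))
        for j in range(p):
            y = X[:, j]
            Z = np.delete(X, j, axis=1)
            Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0)
            ys = (y - y.mean()) / y.std()
            r_max = np.max(np.abs(Zs.T @ ys)) / n  # max absolute correlation
            fit = Lasso(alpha=penalty_from_correlation(r_max)).fit(Z, y)
            coefs[j, np.arange(p) != j] = fit.coef_
        return coefs

In this sketch the speedup comes from replacing the per-node LassoCV grid search with a single Lasso fit whose penalty is computed directly from the maximum absolute correlation.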