Keywords: leverage scores, subsampling, linear regression, least squares
In the context of linear regression, we compare several subsampling algorithms in terms of their actual, rather than theoretical, computational time. The goal is to assess whether relatively sophisticated leverage-based subsampling methods provide as much information as random subsampling when accounting for the time it takes to both form and analyze the subsample. In a small simulation, we find that analyzing a small fraction of the data using leverage-based methods takes as long as, or longer than, analyzing the entire dataset. Work is ongoing, but this provides initial evidence that leverage-based subsampling is not yet viable in practice.
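
The kind of comparison described above can be sketched as follows. This is an illustrative sketch only, not the authors' code: the abstract does not say which leverage-based algorithms were compared, so the exact leverage scores computed from a thin QR decomposition, the problem sizes `n`, `p`, `r`, the simulated Gaussian data, and the helper names `leverage_scores`, `subsample_ols`, and `timed` are all assumptions made for illustration.

```python
import time
import numpy as np

def leverage_scores(X):
    # Exact leverage scores: squared row norms of Q from a thin QR of X.
    # (Assumption: the abstract does not specify exact vs. approximate scores.)
    Q, _ = np.linalg.qr(X, mode="reduced")
    return np.sum(Q**2, axis=1)

def subsample_ols(X, y, r, probs, rng):
    # Draw r rows with replacement using the given probabilities, rescale each
    # sampled row by 1 / sqrt(r * p_i), and solve the reduced least-squares problem.
    idx = rng.choice(X.shape[0], size=r, replace=True, p=probs)
    w = 1.0 / np.sqrt(r * probs[idx])
    beta, *_ = np.linalg.lstsq(X[idx] * w[:, None], y[idx] * w, rcond=None)
    return beta

def timed(fn):
    # Return the result of fn() together with its wall-clock time in seconds.
    start = time.perf_counter()
    out = fn()
    return out, time.perf_counter() - start

# Hypothetical simulation sizes, chosen only for illustration.
rng = np.random.default_rng(0)
n, p, r = 20000, 10, 1000
X = rng.standard_normal((n, p))
beta_true = np.arange(1, p + 1, dtype=float)
y = X @ beta_true + rng.standard_normal(n)

# Full-data least squares.
beta_full, t_full = timed(lambda: np.linalg.lstsq(X, y, rcond=None)[0])

# Uniform random subsampling.
uniform = np.full(n, 1.0 / n)
beta_unif, t_unif = timed(lambda: subsample_ols(X, y, r, uniform, rng))

# Leverage-based subsampling; the timing includes computing the scores.
def leverage_fit():
    h = leverage_scores(X)
    return subsample_ols(X, y, r, h / h.sum(), rng)

beta_lev, t_lev = timed(leverage_fit)

for name, beta, t in [("full", beta_full, t_full),
                      ("uniform", beta_unif, t_unif),
                      ("leverage", beta_lev, t_lev)]:
    err = np.linalg.norm(beta - beta_true)
    print(f"{name:>8}: time = {t:.4f} s, error = {err:.4f}")
```

The leverage-based timing deliberately includes the cost of forming the scores, since that is the cost the comparison is meant to account for. Measured times depend on hardware, the linear algebra backend, and problem size, so a single run of this sketch should not be read as confirming or contradicting the reported finding.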