Abstract:
|
  Linear regression is a fundamental modeling tool in statistics and related fields. In this paper, we study an important variant of linear regression in which the predictor-response pairs are partially mismatched. We use an optimization formulation to simultaneously learn the underlying regression coefficients and the permutation corresponding to the mismatches. The combinatorial structure of the problem leads to computational challenges. To address them, we propose and study a simple greedy local search algorithm. We prove that, under a suitable scaling of the number of mismatched pairs relative to the number of samples and features, and under certain assumptions on the covariates, our local search algorithm converges to a solution with residue bounded by a constant multiple of the squared norm of the noise. In particular, in the noiseless case, our local search algorithm converges to the globally optimal solution at a linear rate. We conduct numerical experiments to refine our understanding of the theoretical results and to demonstrate the promising performance of our proposal compared to existing approaches.
|