Abstract:
Privacy-preserving data analysis is a rising challenge in contemporary statistics, as the privacy guarantees of statistical methods are often achieved at the expense of accuracy. In this talk, we investigate the tradeoff between statistical accuracy and privacy in mean estimation and linear regression, under both the classical low-dimensional and modern high-dimensional settings. A primary focus is to establish minimax optimality for statistical estimation under a differential privacy constraint. To this end, we find that classical lower bound arguments fail to yield sharp results, and new technical tools are called for. By refining the "tracing adversary" technique for lower bounds from the theoretical computer science literature, we formulate a general lower bound argument for minimax risks with differential privacy constraints, and apply this argument to high-dimensional mean estimation and linear regression problems. We also design computationally efficient algorithms that attain the minimax lower bounds up to a logarithmic factor.