Please note: This master’s thesis presentation will take place online.
Mahbod Majid, Master’s candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Gautam Kamath
We are increasingly seeing machine learning algorithms being applied to privacy sensitive data, making it increasingly more important to devise algorithms that preserve privacy. Yet, even the most fundamental high dimensional statistical estimation tasks were not well understood under differential privacy. Namely, no efficient algorithms were known for mean estimation with optimal number of samples under pure differential privacy.
We introduce a novel approach to design efficient private algorithm for statistical estimation tasks based on the Sum of Squares hierarchy and the exponential mechanism. The Sum of Squares hierarchy is a convex programming method that has been successfully used to design efficient algorithms in robust statistics. The exponential mechanism has been widely used in differential privacy to design information theoretically optimal algorithms, which are often inefficient. We use these two approaches combined together to obtain efficient and information theoretically optimal algorithms. We apply this approach to mean estimation and learning Gaussian distributions and demonstrate that they are optimal. We demonstrate that this approach can be applied to many other problems captured by the Sum of Squares method, through a meta-theorem. Moreover, our algorithms draw strong connections between robustness and privacy.
We also provide information theoretical lower bounds to demonstrate that our approaches are statistically optimal. Technically we use packing lower bounds, however the novelty of our lower bounds is in capturing the high probability setting.