CS848 (Fall 2024): Privacy for Data Analysis and ML

Course Overview

This seminar-style course examines privacy in data analysis and machine learning. We explore the interplay between privacy policies, regulations, and privacy-preserving technologies, emphasizing their practical application in user-centric design. The course is structured around four core modules, each examining a distinct facet of privacy through a different research methodology: empirical privacy, semantic privacy, user privacy, and legal privacy.

Students will be expected to write reviews of published research papers from the field, present a paper to the class in the style of a research seminar, and carry out a novel research project, which they will write up and present at the end of the term.

Prerequisites: The course is open to interested graduate students with sufficient mathematical maturity. Basic knowledge of algorithms, proof techniques, and probability will be assumed. Familiarity with databases and machine learning is helpful but not required.

The course is currently listed in the Databases area.

Format:

Graded Student Work:

Schedule

DATE TOPIC RECOMMENDED READINGS
Week 1 (Sep 5) Introduction (slides)
Lecture 1: Empirical Privacy (slides)
In-class exercise (pdf)
A. Narayanan, V. Shmatikov, "Robust De-anonymization of Large Sparse Datasets" (Netflix Prize data). IEEE SP 2008
K. LeFevre, D. DeWitt, R. Ramakrishnan, "Incognito: Efficient Full-Domain K-Anonymity". SIGMOD 2005
R. Shokri, M. Stronati, C. Song and V. Shmatikov, "Membership Inference Attacks Against Machine Learning Models". IEEE SP 2017
Week 2 (Sep 12) Lecture 2: Semantic Privacy (slides)
In-class exercise (files)
Cynthia Dwork and Aaron Roth, “The Algorithmic Foundations of Differential Privacy”: Chapters 2-3
Week 3 (Sep 19) Lecture 3: User Privacy and HCI (slides)
In-class exercise (see slides)
K. Fischer, I. Trummová, P. Gajland, Y. Acar, S. Fahl, & A. Sasse. “The Challenges of Bringing Cryptography from Research Papers to Products: Results from an Interview Study with Experts”. Usenix Security Symposium 2024
Week 4 (Sep 26) Lecture 4: Legal Privacy (slides)
In-class exercise (see slides)

Guest Lecture by Dr. Kris Shrishak

Reading x 1
(Choose project due)
Pamela J. Wisniewski and Xinru Page, Chapter 2 "Privacy Theories and Frameworks" in "Modern Socio-Technical Perspectives on Privacy" (open access link) A useful book for potential projects.

(L2) M. Nouwens, I. Liccardi, M. Veale, D. Karger, and L. Kagal, “Dark Patterns after the GDPR: Scraping Consent Pop-ups and Demonstrating their Influence”. CHI 2020. [Sreepriya | Bihui]
Week 5 (Oct 3) Guest Lecture by Prof. Geoffrey Rockwell

Reading x 2
(Project proposal due)
(L1) T. Marjanov, M. Konstantinou, M. Jóźwiak, and D. Spagnuelo, “Data Security on the Ground: Investigating Technical and Legal Requirements under the GDPR”. PoPETs 2023. [Qianqiu | Kerem]
(L3) N. Samarin, S. Kothari, Z. Siyed, O. Bjorkman, R. Yuan, P. Wijesekera, N. Alomar, J. Fischer, C. Hoofnagle, and S. Egelman, “Lessons in VCR Repair: Compliance of Android App Developers with the California Consumer Privacy Act (CCPA)”. PoPETs 2023. [Prach | Nafis]
Week 6 (Oct 10) Reading x 3
(S1) A. R. Elkordy, J. Zhang, Y. H. Ezzeldin, K. Psounis, and S. Avestimehr, “How Much Privacy Does Federated Learning with Secure Aggregation Guarantee?”. PoPETs 2023. [Eric | Hauton]
(S2) F. Boenisch, C. Mühl, R. Rinberg, J. Ihrig, and A. Dziedzic, “Individualized PATE: Differentially Private Machine Learning with Individual Privacy Guarantees”. PoPETs 2023. [Qingyang | Rozhan]
(S3) X. Tang, R. Shin, H. A. Inan, A. Manoel, F. Mireshghallah, Z. Lin, S. Gopi, J. Kulkarni, and R. Sim, “Privacy-preserving in-context learning with differentially private few-shot generation”. ICLR 2024. [Zhengyuan | Anudeep]
Week 7 (Oct 17) (Reading Week)
Week 8 (Oct 24) Reading x 3
(U1) D. Franzen, S. Nuñez von Voigt, P. Sörries, F. Tschorsch, and C. Müller-Birn, “Am I Private and If So, how Many? Communicating Privacy Guarantees of Differential Privacy with Risk Communication Formats”. CCS 2022. [Sina | Yingke]
(U2) S. A. Horstmann, S. Domiks, M. Gutfleisch, M. Tran, Y. Acar, V. Moonsamy, and A. Naiakshina, “‘Those things are written by lawyers, and programmers are reading that.’ Mapping the communication gap between software developers and privacy experts”. PoPETs 2024. [Kerem | Muhammad]
(U3) P. G. Kelley, C. Cornejo, L. Hayes, E. S. Jin, A. Sedley, K. Thomas, Y. Yang, and A. Woodruff, “‘There will be less privacy, of course’: How and why people in 10 countries expect AI will affect privacy in the future”. SOUPS 2023. [Bella | Prach]
Week 9 (Oct 31) Reading x 3
(E1) D. Ding, Y. Wang, G. Wang, D. Zhang, and D. Kifer, “Toward detecting violations of differential privacy”. CCS 2018. [Hauton | Alireza]
(E2) T. Humphries, S. Oya, L. Tulloch, M. Rafuse, I. Goldberg, U. Hengartner, and F. Kerschbaum, “Investigating Membership Inference Attacks under Data Dependencies”. CSF 2023. [Dongfu | Lucas]
(E3) N. Papernot and T. Steinke, “Hyperparameter tuning with Rényi differential privacy”. ICML 2022. [Rozhan | Zhengyuan]
Week 10 (Nov 7) Project week
(Project mid-term report)
Week 11 (Nov 14) Reading x 3
(S4) E. Bagdasaryan, O. Poursaeed, and V. Shmatikov, “Differential Privacy has Disparate Impact on Model Accuracy”. NeurIPS 2019. [Bihui | Sreepriya]
(S5) L. Rosenblatt, B. Herman, A. Holovenko, W. Lee, J. Loftus, E. McKinnie, T. Rumezhak, A. Stadnik, B. Howe, and J. Stoyanovich, “Epistemic parity: Reproducibility as an evaluation metric for differential privacy”. ACM SIGMOD Record, 2024. [Arthur | Gengyi]
(S6) R. Ashmead, R. Cumings-Menon, et al., “The 2020 census disclosure avoidance system topdown algorithm”. Harvard Data Science Review, 2022. [Muhammad | Eric]
Week 12 (Nov 21) Reading x 3
(U4) N. Agrawal, R. Binns, M. Van Kleek, K. Laine, and N. Shadbolt, “Exploring Design and Governance Challenges in the Development of Privacy-Preserving Computation”. CHI 2021. [Alireza | Bella]
(U5) L. Qin, A. Lapets, F. Jansen, P. Flockhart, K. D. Albab, I. Globus-Harris, S. Roberts, and M. Varia, “From Usability to Secure Computing and Back Again”. SOUPS 2019. [Nafis | Sina]
(U6) G. Sandoval, H. Pearce, T. Nys, R. Karri, S. Garg, and B. Dolan-Gavitt, “Lost at C: A user study on the security implications of large language model code assistants”. USENIX Security 2023. [Anudeep | Qianqiu]
Week 13 (Nov 28) Reading x 3
(E4) N. Carlini, S. Chien, M. Nasr, S. Song, A. Terzis, and F. Tramèr, “Membership inference attacks from first principles”. IEEE SP 2022. [Lucas | Qingyang]
(E5) V. Feldman, “Does learning require memorization? A short tale about a long tail”. STOC 2020. [Yingke | Dongfu]
(E6) S. Yeom, I. Giacomelli, M. Fredrikson, and S. Jha, “Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting”. CSF 2018. [Gengyi | Arthur]
Week 14 (Dec 5) Project final presentation
Week 15 (Dec 9) (Project final report)

Academic Integrity

Note that students are not generally permitted to submit the same work for credit in multiple classes. For example, if a student has reviewed or presented one of the papers in another seminar class, he or she should avoid reviewing or presenting it again for this class.

The general Faculty and University policy:

Note for Students with Disabilities

AccessAbility Services, located in Needles Hall, Room 1401, collaborates with all academic departments to arrange appropriate accommodations for students with disabilities without compromising the academic integrity of the curriculum. If you require academic accommodations to lessen the impact of your disability, please register with AccessAbility at the beginning of each academic term.