CS848 (Fall 2024): Privacy for Data Analysis and ML

Course Overview

Instructor: Xi He (office hours by appointment), co-designed with Bailey Kacsmar (University of Alberta)
Lectures/Paper presentation + Discussion (Thur 1:00pm-3:50pm, DC 2568)
LEARN for additional reading materials, recorded videos, submissions, and grading
Piazza for questions and discussion

This seminar-style course examines privacy in data analysis and machine learning. We explore the interplay between privacy policies, regulations, and privacy-preserving technologies, with emphasis on their practical application in user-centric design. The course is structured around four core modules, each examining a distinct facet of privacy through a different research methodology:

Empirical Privacy: Investigating vulnerabilities in anonymous data and machine learning models.
Semantic Privacy: Exploring the theoretical foundations and algorithmic implementations of differential privacy.
User Privacy and HCI: Understanding user experiences and perceptions of privacy in the digital age.
Legal Privacy: Navigating the complex ethical, legal, and policy frameworks governing data use.

Students will be expected to write reviews of published research papers from the field, present papers to the class in a research-seminar style, and carry out a novel research project that they will write up and present at the end of the term.

Prerequisites: The course is open to interested graduate students with sufficient mathematical maturity. Basic knowledge in algorithms, proof techniques, and probability will be assumed. Familiarity with databases and machine learning would help but is not necessary.

The course is currently listed in the Databases area.

Format:

There are four modules: (1) Empirical Privacy; (2) Semantic Privacy; (3) User Privacy and HCI; (4) Legal Privacy.
Each module includes: (a) 1 lecture; (b) in-class exercises; (c) 3-6 sessions of student paper presentations and discussion

Graded Student Work:

Paper reviews, presentations, and discussion: 50% (detailed instruction)
- 20% Seminar style presentations as discussion lead (1-2 per term)
- 15% Paper reviews (15 papers across the term)
- 10% Class participation
- 5% Quality of feedback on peers
Group project: 50% (detailed instruction)
- 5% Project proposal
- 10% Mid-term project progress report
- 10% Project presentation
- 25% Final project report

Schedule

DATE	TOPIC	RECOMMENDED READINGS
Week 1 (Sep 5)	Introduction (slides) Lecture 1: Empirical Privacy (slides) In-class exercise (pdf)	A. Narayanan, V. Shmatikov, "Robust deanonymization of Sparse Datasets (Netflix Prize Data)." IEEE SP 2008 K. Lefevre, D. DeWitt, R. Ramakrishnan, "Incognito: Efficient full domain k-anonymity". SIGMOD 2005 R. Shokri, M. Stronati, C. Song and V. Shmatikov, "Membership Inference Attacks Against Machine Learning Models". IEEE SP 2017
Week 2 (Sep 12)	Lecture 2: Semantic Privacy (slides) In-class exercise (files)	Cynthia Dwork and Aaron Roth, “The Algorithmic Foundations of Differential Privacy”: Chapters 2-3
Week 3 (Sep 19)	Lecture 3: User Privacy and HCI (slides) In-class exercise (see slides)	K. Fischer, I. Trummová, P. Gajland, Y. Acar, S. Fahl, & A. Sasse. “The Challenges of Bringing Cryptography from Research Papers to Products: Results from an Interview Study with Experts”. Usenix Security Symposium 2024
Week 4 (Sep 26)	Lecture 4: Legal Privacy (slides) In-class exercise (see slides) Guest Lecture by Dr. Kris Shrishak Reading x 1 (Choose project due)	Pamela J. Wisniewski and Xinru Page, Chapter 2 "Privacy Theories and Frameworks" in "Modern Socio-Technical Perspectives on Privacy" (open access link) A useful book for potential projects. (L2) M. Nouwens, I. Liccardi, M. Veale, D. Karger, and L. Kagal, “Dark patterns after the GDPR: Scraping consent pop-ups and demonstrating their influence”. CHI 2020. [Sreepriya \| Bihui]
Week 5 (Oct 3)	Guest Lecture by Prof. Geoffrey Rockwell Reading x 2 (Project proposal due)	(L1) T. Marjanov, M. Konstantinou, M. Jóźwiak, and D. Spagnuelo, “Data Security on the Ground: Investigating Technical and Legal Requirements under the GDPR”. PoPETs 2023. [Qianqiu \| Kerem] (L3) N. Samarin, S. Kothari, Z. Siyed, O. Bjorkman, R. Yuan, P. Wijesekera, N. Alomar, J. Fischer, C. Hoofnagle, and S. Egelman, “Lessons in VCR repair: Compliance of Android app developers with the California Consumer Privacy Act (CCPA)”. PoPETs 2023. [Prach \| Nafis]
Week 6 (Oct 10)	Reading x 3	(S1) A. R. Elkordy, J. Zhang, Y. H. Ezzeldin, K. Psounis, and S. Avestimehr, “How Much Privacy Does Federated Learning with Secure Aggregation Guarantee?” PoPETs 2023. [Eric \| Hauton] (S2) F. Boenisch, C. Mühl, R. Rinberg, J. Ihrig, and A. Dziedzic, “Individualized PATE: Differentially Private Machine Learning with Individual Privacy Guarantees”. PoPETs 2023. [Qingyang \| Rozhan] (S3) X. Tang, R. Shin, H. A. Inan, A. Manoel, F. Mireshghallah, Z. Lin, S. Gopi, J. Kulkarni, and R. Sim, “Privacy-preserving in-context learning with differentially private few-shot generation”. ICLR 2024. [Zhengyuan \| Anudeep]
Week 7 (Oct 17)	(Reading Week)
Week 8 (Oct 24)	Reading x 3	(U1) D. Franzen, S. Nuñez von Voigt, P. Sörries, F. Tschorsch, and C. Müller-Birn, “Am I Private and If So, how Many? Communicating Privacy Guarantees of Differential Privacy with Risk Communication Formats”. CCS 2022. [Sina \| Yingke] (U2) S. A. Horstmann, S. Domiks, M. Gutfleisch, M. Tran, Y. Acar, V. Moonsamy, and A. Naiakshina, “‘Those things are written by lawyers, and programmers are reading that.’ Mapping the communication gap between software developers and privacy experts”. PoPETs 2024. [Kerem \| Muhammad] (U3) P. G. Kelley, C. Cornejo, L. Hayes, E. S. Jin, A. Sedley, K. Thomas, Y. Yang, and A. Woodruff, “‘There will be less privacy, of course’: How and why people in 10 countries expect AI will affect privacy in the future”. SOUPS 2023. [Bella \| Prach]
Week 9 (Oct 31)	Reading x 3	(E1) D. Ding, Y. Wang, G. Wang, D. Zhang, and D. Kifer, “Toward detecting violations of differential privacy”. CCS 2018. [Hauton \| Alireza] (E2) T. Humphries, S. Oya, L. Tulloch, M. Rafuse, I. Goldberg, U. Hengartner, and F. Kerschbaum, “Investigating Membership Inference Attacks under Data Dependencies”. CSF 2023. [Dongfu \| Lucas] (E3) N. Papernot and T. Steinke, “Hyperparameter tuning with Renyi differential privacy”. ICML 2022. [Rozhan \| Zhengyuan]
Week 10 (Nov 7)	Project week (Project mid-term report)
Week 11 (Nov 14)	Reading x 3	(S4) E. Bagdasaryan, O. Poursaeed, and V. Shmatikov, “Differential Privacy has Disparate Impact on Model Accuracy”. NeurIPS 2019. [Bihui \| Sreepriya] (S5) L. Rosenblatt, B. Herman, A. Holovenko, W. Lee, J. Loftus, E. McKinnie, T. Rumezhak, A. Stadnik, B. Howe, and J. Stoyanovich, “Epistemic parity: Reproducibility as an evaluation metric for differential privacy”. ACM SIGMOD Record, 2024. [Arthur \| Gengyi] (S6) R. Ashmead, R. Cumings-Menon, et al., “The 2020 census disclosure avoidance system topdown algorithm”. Harvard Data Science Review, 2022. [Muhammad \| Eric]
Week 12 (Nov 21)	Reading x 3	(U4) N. Agrawal, R. Binns, M. Van Kleek, K. Laine, and N. Shadbolt, “Exploring Design and Governance Challenges in the Development of Privacy-Preserving Computation”. CHI 2021. [Alireza \| Bella] (U5) L. Qin, A. Lapets, F. Jansen, P. Flockhart, K. D. Albab, I. Globus-Harris, S. Roberts, and M. Varia, “From Usability to Secure Computing and Back Again”. SOUPS 2019. [Nafis \| Sina] (U6) G. Sandoval, H. Pearce, T. Nys, R. Karri, S. Garg, and B. Dolan-Gavitt, “Lost at C: A user study on the security implications of large language model code assistants”. USENIX Security 2023. [Anudeep \| Qianqiu]
Week 13 (Nov 28)	Reading x 3	(E4) N. Carlini, S. Chien, M. Nasr, S. Song, A. Terzis, and F. Tramer, “Membership inference attacks from first principles”. SP 2022. [Lucas \| Qingyang] (E5) V. Feldman, “Does learning require memorization? a short tale about a long tail”. SIGACT 2020. [Yingke \| Dongfu] (E6) S. Yeom, I. Giacomelli, M. Fredrikson, and S. Jha, “Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting”. CSF 2018. [Gengyi \| Arthur]
Week 14 (Dec 5)	Class ends
Week 15 (Dec 9)	(Project final report)

Academic Integrity

Note that students are not generally permitted to submit the same work for credit in multiple classes. For example, a student who has reviewed or presented one of the papers in another seminar class should avoid reviewing or presenting it again for this class.

The general Faculty and University policy:

Academic Integrity: In order to maintain a culture of academic integrity, members of the University of Waterloo community are expected to promote honesty, trust, fairness, respect and responsibility. Check the Office of Academic Integrity's website for more information.
All members of the UW community are expected to hold to the highest standard of academic integrity in their studies, teaching, and research. This site explains why academic integrity is important and how students can avoid academic misconduct. It also identifies resources available on campus for students and faculty to help achieve academic integrity in — and out — of the classroom.
Grievance: A student who believes that a decision affecting some aspect of his/her university life has been unfair or unreasonable may have grounds for initiating a grievance. Read Policy 70 — Student Petitions and Grievances, Section 4. When in doubt please be certain to contact the department's administrative assistant who will provide further assistance.
Discipline: A student is expected to know what constitutes academic integrity, to avoid committing academic offenses, and to take responsibility for his/her actions. A student who is unsure whether an action constitutes an offense, or who needs help in learning how to avoid offenses (e.g., plagiarism, cheating) or about "rules" for group work/collaboration should seek guidance from the course professor, academic advisor, or the Undergraduate Associate Dean. For information on categories of offenses and types of penalties, students should refer to Policy 71 — Student Discipline. For typical penalties, check Guidelines for the Assessment of Penalties.
Avoiding Academic Offenses: Most students are unaware of the line between acceptable and unacceptable academic behaviour, especially when discussing assignments with classmates and using the work of other students. For information on commonly misunderstood academic offenses and how to avoid them, students should refer to the Faculty of Mathematics Cheating and Student Academic Discipline Policy.
Appeals: A decision made or a penalty imposed under Policy 70, Student Petitions and Grievances (other than a petition) or Policy 71, Student Discipline may be appealed if there is a ground. A student who believes he/she has a ground for an appeal should refer to Policy 72 — Student Appeals.

Note for Students with Disabilities

AccessAbility Services, located in Needles Hall, Room 1401, collaborates with all academic departments to arrange appropriate accommodations for students with disabilities without compromising the academic integrity of the curriculum. If you require academic accommodations to lessen the impact of your disability, please register with AccessAbility at the beginning of each academic term.