CS848: PRIVACY & FAIRNESS IN DATA SCIENCE

Course Overview

This course focuses on the design of data science algorithms and on characterizing their privacy and fairness properties. Our daily lives are actively being monitored through web browsers, social networks, wearable devices, and even robots. These data are routinely analyzed by statistical and machine learning tools to infer aggregate patterns of our behavior (with applications to science, medicine, advertising, IoT, etc.). However, these data also contain private information about us that we would not like revealed to others (e.g., medical history, sexual orientation, locations visited). Disclosure of such data breaches our privacy. Moreover, acting on data containing such sensitive information can result in physical or financial harm and in discrimination, raising issues of fairness.

A grand challenge that pervades all of computer science (and, more generally, scientific) research is: how to learn from data collected from individuals while provably ensuring that (a) private properties of individuals are not revealed by the results of the learning process, and (b) the decisions taken as a result of data analysis are fair.

In this course, we will study recent work in computer science that mathematically formulates these societal constraints. For privacy, we will study differential privacy, a breakthrough privacy notion: an algorithm is differentially private if its output is insensitive to (small) changes in the input. Differentially private algorithms provide provable privacy guarantees while learning from data in varied domains (e.g., social science, medicine, communications) and varied modalities (e.g., tables, graphs, streams), and are used by government organizations and by internet corporations such as Google and Apple to collect and analyze data. Differential privacy has also been shown to help design better learning algorithms. We will also investigate how to mathematically formulate the notion of fairness in data analysis. We will study both the theory and practice of designing private and fair data analysis algorithms and their applications to data arising from real-world systems.
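To make the privacy definition concrete: an algorithm M is ε-differentially private if, for every pair of datasets D and D' that differ in one individual's record and every set S of possible outputs, Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D') ∈ S]. As a minimal illustration (this sketch is not part of the course materials; the function name and parameter choices here are illustrative), the Laplace mechanism achieves this guarantee for counting queries by adding calibrated noise:

    import numpy as np

    def laplace_mechanism(true_count, epsilon, sensitivity=1.0):
        # A counting query changes by at most `sensitivity` when one
        # person's record is added or removed, so Laplace noise with
        # scale sensitivity/epsilon makes the output distributions on
        # neighboring datasets differ by at most a factor of e^epsilon.
        noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
        return true_count + noise

    # Example: privately release the number of matching records.
    # Smaller epsilon means more noise and a stronger privacy guarantee.
    noisy_count = laplace_mechanism(1042, epsilon=0.5)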

Prerequisites: The course is open to interested graduate students with sufficient mathematical maturity. Basic knowledge of algorithms, proof techniques, and probability will be assumed. Familiarity with databases and machine learning would help but is not necessary.

The course is currently listed in the Databases area.

Format:

Graded Student Work:

Grading:

Schedule

DATE TOPIC READING
Sep 10 Introduction (slides)
Module I: Intro to Privacy
Privacy attacks practicum (slides) + in-class exercise (pdf)
Differential privacy & basic algorithms (slides)
Machanavajjhala, Kifer, "Designing Statistical Privacy For Your Data", CACM 2015 (PDF)
Dinur, Nissim, "Revealing Information while Preserving Privacy", PODS 2003 (PDF)
Dwork, Roth, "The Algorithmic Foundations of Differential Privacy", Foundations and Trends in Theoretical Computer Science 9(3-4), 2014 (PDF), Chapters 2 & 3 (henceforth called the Privacy Textbook)
Sep 17 Designing complex algorithms & composition (slides)
In-class mini project 1 (handout, files)
(mini project 1 report due by Sep 23 noon)
DPComp.org
Hay, Rastogi, Miklau, Suciu, "Boosting the Accuracy of Differentially Private Histograms Through Consistency", PVLDB 3(1) 2010 (PDF)
Sep 24 Discussion of mini project 1
Module II: Intro to Fairness
Fairness practicum (slides) + in-class exercise (code)
(choice of project due by Oct 1 noon)
Angwin, Julia, "Machine Bias", ProPublica 2016 (Article)
Oct 1 Fairness in ML I (slides)
In-class mini project 2 (part i) (files)
Dwork, Cynthia, "Fairness Through Awareness", ITCS 2012 (PDF)
Feldman, Michael, "Certifying and Removing Disparate Impact", KDD 2015 (PDF)
Oct 8 Fairness in ML II (slides)
In-class mini project 2 (part ii) (files)
(mini project 2 report due by Oct 14 noon)
Paper Reading Instructions
Hardt, Moritz, "Equality of Opportunity in Supervised Learning", NIPS 2016 (PDF)

Guidelines for Paper Reading & Writing a Good Critique (LEARN)
Oct 15 (reading week)

(project proposal due by Oct 21)
Oct 22 Topic 1: privacy vs. fairness
(privacy and fairness, trade-offs, etc.)
Miklau et al, "Fair Decision Making using Privacy-Protected Data" (PDF) (nanantha)
Fain, Brandon, "The Core of the Participatory Budgeting Problem", WINE 2017 (PDF) (kknopf)
Optional Reading: Ghodsi et al., "Dominant Resource Fairness: Fair Allocation of Multiple Resource Types", NSDI 2011 (PDF)

Kleinberg et al, "Inherent Trade-Offs in the Fair Determination of Risk Scores", ITCS 2017 (PDF) (v3menon, s34agarw)
Kifer, Machanavajjhala, "No Free Lunch in Differential Privacy", SIGMOD 2011 (PDF) (bkacsmar)
Optional Reading: Kifer, Machanavajjhala, "Pufferfish", ACM TODS 2014 (PDF); He et al, "Blowfish Privacy", SIGMOD 2014 (PDF)
Oct 29 Topic 2: sources of bias
Pierson et al, "A large-scale analysis of racial disparities in police stops across the United States" (PDF) (jb2alawo)
Caliskan et al, "Semantics Derived Automatically from Language Corpora Contain Human-like Biases", Science 2017 (PDF) (utdas)
Torralba & Efros, "Unbiased Look at Dataset Bias", CVPR 2011 (PDF) (C447CHEN, jpremkum)
Optional Reading: Bakshy et al. "Exposure to ideologically diverse news and opinion on Facebook" (PDF)
Nov 05 Topic 3: fairness mechanisms

Zemel et al, "Learning Fair Representations", JMLR 2013 (PDF) (z97hu)
Bolukbasi et al., "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings", NIPS 2016 (PDF) (D35KUMAR)
Kusner et al., "Counterfactual Fairness", NIPS 2017 (PDF) (mniknami, mkaregar)
Optional Reading: Kilbertus, Niki, "Avoiding Discrimination through Causal Reasoning", NIPS 2017 (PDF)
Nov 12 (mid-term report due by Nov 12 noon)
Project day
Nov 19 Topic 4: private machine learning
Chaudhuri et al, "Differentially Private Empirical Risk Minimization", JMLR 2011 (PDF) (S3MOHAPA)
Abadi et al, "Deep Learning with Differential Privacy", ACM CCS 2016 (PDF) (mrafuse, r5akhava)
Optional Reading: Mironov, "Rényi Differential Privacy", IEEE CSF 2017 (PDF)
Papernot et al, "Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data", ICLR 2017 (PDF) (t3humphr)
Nov 26 Topic 5: deployments of DP
Erlingsson et al, "RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response", ACM CCS 2014 (PDF) (eeaton)
Optional Reading: Bittau et al, "Prochlo: Strong Privacy for Analytics in the Crowd", SOSP 2017 (PDF); Erlingsson et al, "Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity", SODA 2019 (PDF)
Kotsogiannis et al, "PrivateSQL", VLDB 2019 (PDF) (m2mazmud, harry)
Optional Reading: Ge et al, "APEx: accuracy-aware DP data exploration", SIGMOD 2019 (PDF)
Bater et al, "Shrinkwrap: Differentially-Private Query Processing in Private Data Federations", VLDB 2019 (PDF) (akassis)
Dec 03 Final presentation
Dec 10 (deadline for submitting final report)

Academic Integrity

Note that students are generally not permitted to submit the same work for credit in multiple classes. For example, a student who has reviewed or presented one of the papers in another seminar class should not review or present it again for this class.

The general Faculty and University policy:

Note for Students with Disabilities

AccessAbility Services, located in Needles Hall, Room 1401, collaborates with all academic departments to arrange appropriate accommodations for students with disabilities without compromising the academic integrity of the curriculum. If you require academic accommodations to lessen the impact of your disability, please register with AccessAbility at the beginning of each academic term.