CS848: BUILDING PRIVACY-AWARE DATABASE SYSTEMS

Course Overview

This course focuses on the design and development of privacy-aware database systems. We will discuss the privacy requirements for database systems in different settings and cover the state-of-the-art tools that achieve these requirements, including differential privacy, secure computation, encryption, and trusted execution environments (TEEs). We will also examine the challenges in integrating these techniques and demonstrate the design principles and optimization opportunities for security- and privacy-aware database systems.

The course is currently listed in the Databases area.

Format:

Graded Student Work:

Schedule

DATE TOPIC RECOMMENDED READINGS
Week 1 (Jan 11 - Jan 15) Introduction
(lecture, slides)

  • “Privacy Changes Everything”. Rogers et al. Poly 2018 and DMAH 2019 link
  • “Understanding Database Reconstruction Attacks on Public Data”. Garfinkel et al. Commun. ACM 2019 link
  • “The Seven Sins of Personal-Data Processing Systems under GDPR”. Shastri et al. HotCloud 2019 link

Week 2 (Jan 18 - Jan 22) Module I: Centralized Setting - part a
(lecture, slides)

Mini-Assignment 1 (pdf, tex)
(due by Feb 1, 11pm)

  • PINQ: “Privacy Integrated Queries: An Extensible Platform for Privacy-Preserving Data Analysis”. Frank McSherry. SIGMOD 2009 link
  • Flex: “Towards Practical Differential Privacy for SQL Queries”. Johnson et al. VLDB 2018 link
  • “Differential Privacy Under Fire”. Haeberlen et al. SEC 2011 link
  • (Optional) "Algorithmic Foundations of Differential Privacy". Dwork and Roth. Foundations and Trends 9(3-4), 2014 (PDF), Chapters 2 & 3.

Week 3 (Jan 25 - Jan 29) Module I: Centralized Setting - part b

  • “PrivateSQL: a differentially private SQL query engine.” Kotsogiannis et al. VLDB 2019. link [z82xie]
  • “Airavat: Security and Privacy for MapReduce”. Roy et al. NSDI 2010 link [s2udayas]
  • “A Universal Platform for Video Analytics with Differential Privacy”. Wang et al. PoPets 2020 link [bndong]
  • (Optional) "GUPT: privacy preserving data analysis made easy". Mohan et al. SIGMOD 2012 link
  • (Optional) "Relationship privacy: output perturbation for queries with joins." Rastogi et al. PODS 2009 link
  • (Optional) “Differentially Private SQL with Bounded User Contribution” Wilson et al. PoPets 2019. link

Week 4 (Feb 1 - Feb 5) Module I: Centralized Setting - part b cont.
(Project proposal due)

  • “Differentially Private Event Sequences over Infinite Streams”. Kellaris et al. VLDB 2014 link [ccovingt]
  • “Differential Privacy for Growing Databases”. Cummings et al. NIPS 2018 link [elepert]
  • “Formalizing Data Deletion in the Context of the Right to be Forgotten” Garg et al. EUROCRYPT (2) 2020 link [s693zhan]
  • (Optional) “Understanding and benchmarking the impact of GDPR on database systems”. Shastri et al. VLDB 2020 link
  • (Optional) “PeGaSus: Data-Adaptive Differentially Private Stream Processing”. Chen et al. CCS 2017 link
  • (Optional) “Continuous Release of Data Streams under both Centralized and Local Differential Privacy”. Wang et al. NDSS 2021 link

Week 5 (Feb 8 - Feb 12) Module II: Federated Setting - part a
(lecture, slides)

Mini-Assignment 2 (pdf, tex)
(due by Mar 1, 11pm)

  • "RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response", ACM CCS 2014 link
  • “Crypte: Crypto-Assisted Differential Privacy on Untrusted Servers” Chowdhury et al. SIGMOD 2020. link
  • (Optional) “The Limits of Two-Party Differential Privacy”. McGregor et al. FOCS 2010 link
  • (Optional) "DP-Cryptography: Marrying Differential Privacy and Cryptography in Emerging Applications". Wagh et al. CACM 2021 link

Week 6 (Feb 15 - Feb 19) (reading week)

Week 7 (Feb 22 - Feb 26) Module II: Federated Setting - part b

  • “Prochlo: Strong Privacy for Analytics in the Crowd”. Bittau et al. SOSP 2017 link [ssveitch]
  • “Orchard: Differentially Private Analytics at Scale”. Roth et al. OSDI 2020 link [x556li]
  • “Encrypted Databases for Differential Privacy”. Agarwal et al. PoPets 2019 link [ckomlo]
  • (Optional) "Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity". Erlingsson et al. SODA 2019 link
  • (Optional) "Honeycrisp: Large-Scale Differentially Private Aggregation Without a Trusted Core". Roth et al. SOSP 2019 link
  • (Optional) "Secure and Scalable Document Similarity on Distributed Databases: Differential Privacy to the Rescue". Schoppmann et al. PoPets 2020 link

Week 8 (Mar 1 - Mar 5) Module II: Federated Setting - part b cont.
(non-linear queries)

  • "Collecting and Analyzing Data Jointly from Multiple Services under Local Differential Privacy", VLDB 2020 link [f5ebrahi]
  • “DJoin: Differentially Private Join Queries Over Distributed Databases”. Narayan et al. OSDI 2012 link [c656wang]
  • “Shrinkwrap: efficient SQL query processing in differentially private data federations.” Bater et al. VLDB 2018 link [squnaibi]
  • (Optional) "Composing Differential Privacy and Secure Computation: A case study on scaling private record linkage." He et al. CCS 2017 link
  • (Optional) "SAQE: Practical Privacy-Preserving Approximate Query Processing for Data Federations". Bater et al. VLDB 2020 link
  • (Optional) "On Distributed Differential Privacy and Counting Distinct Elements". Chen et al. https://arxiv.org/abs/2009.09604

Week 9 (Mar 8 - Mar 12) Module III: Cloud Setting - part a
(lecture)

Mini-Assignment 3 (due by next live session)

  • “CryptDB: Protecting Confidentiality with Encrypted Query Processing”. Popa et al. SOSP 2011 link
  • “Opaque: An Oblivious and Encrypted Distributed Analytics Platform”. Zheng et al. NSDI 2017 link
  • (Optional) “Leakage-Abuse Attacks against Order-Revealing Encryption”. Grubbs et al. S&P 2017 link
  • (Optional) “Inference attacks on property-preserving encrypted databases.” Naveed et al. CCS 2015 link
  • (Optional) “Leaky Cauldron on the Dark Land: Understanding Memory Side-Channel Hazards in SGX”. Wang et al. CCS 2017 link

Week 10 (Mar 15 - Mar 19) (Project mid-term report)

Week 11 (Mar 22 - Mar 26) Module III: Cloud Setting - part b

  • “Big Data Analytics over Encrypted Datasets with Seabed.” Papadimitriou et al. OSDI 2016 link
  • “Leakage-Abuse Attacks against Order-Revealing Encryption”. Grubbs et al. S&P 2017 link [nvduddu]
  • “Partitioned Data Security on Outsourced Sensitive and Non-sensitive Data”. Mehrotra et al. ICDE 2019 link [a54bhati]
  • (Optional) "Intertwining Order Preserving Encryption and Differential Privacy". Arxiv 2020. link
  • (Optional) MONOMI: “Processing analytical queries over encrypted data”. Tu et al. PVLDB 2013 link

Week 12 (Mar 29 - Apr 2) Module III: Cloud Setting - part b cont.

  • “EnclaveDB: A Secure Database using SGX”. Priebe et al. S&P 2018 link [sy2zhao]
  • “StealthDB: a Scalable Encrypted Database with Full SQL Query Support”. Gribov et al. NDSS 2018 link [bdzimmer]
  • “EncDBDB: Searchable Encrypted, Fast, Compressed, In-Memory Database using Enclaves”. Fuhry et al. Arxiv 2020 link [y264luo]
  • (Optional) "ObliDB: Oblivious Query Processing for Secure Databases". Eskandarian and Zaharia. VLDB 2020 link

Week 13 (Apr 5 - Apr 9) (Project final presentation)

Week 14 (Apr 12 - Apr 14)

Academic Integrity

Note that students are not generally permitted to submit the same work for credit in multiple classes. For example, if a student has reviewed or presented one of the papers in another seminar class, he or she should avoid reviewing or presenting it again for this class.

The general Faculty and University policy:

Note for Students with Disabilities

AccessAbility Services, located in Needles Hall, Room 1401, collaborates with all academic departments to arrange appropriate accommodations for students with disabilities without compromising the academic integrity of the curriculum. If you require academic accommodations to lessen the impact of your disability, please register with AccessAbility at the beginning of each academic term.