More than 94% of current enterprises – including 83% of healthcare organizations – rely on cloud services, especially for their infrastructure and data storage needs. Organizations outsource their storage to third-party cloud providers because of the high cost of owning and maintaining an on-premises storage or compute fleet. However, outsourcing an application’s data in plaintext can reveal sensitive information to a potentially untrustworthy cloud provider. While encrypting the data is the obvious first step toward data privacy, a growing body of attacks exploits side-channel information, such as the frequency or duration of accesses to encrypted data, to uncover the plaintext. Such attacks are called inference or access pattern attacks. For example, the duration and frequency with which an oncologist accesses (encrypted) data can reveal the type of a patient’s cancer (e.g., based on the frequency and intervals of chemotherapy treatments).
In this course, we will study the design and implementation of private data systems that protect applications against access pattern attacks. We begin by reviewing basic cryptographic encryption schemes, then examine an encrypted database and its potential leakages as motivation for learning about systems with stronger privacy guarantees. The course will cover data systems that use four types of cryptographic primitives: oblivious RAM, trusted hardware enclaves, secure multi-party computation, and private information retrieval. Note that since this course is offered under the data systems category, the discussions will focus primarily on database design concepts.
The final grade for the course will be based on the following components:
Reviews (20%): Each week, students are expected to read at least two papers and write a review of each. Each review should be no more than 500 words and contain the following sections, following the typical format of reviews in database conferences:
Week | Dates | Topic | Speaker | Readings |
---|---|---|---|---|
1 | Jan 9th | Intro | Sujaya Maiyya | Slides |
1 | Jan 11th | Background: crypto and DB basics | Sujaya Maiyya | Required: Textbook Chapter 0 and Chapter 2. Recommend reading other chapters as well. Slides |
2 | Jan 16th | Intro to ORAM | Sujaya Maiyya | Reading required but review not required: (1) PathORAM up to Section 4. (2) RingORAM. Slides |
2 | Jan 18th | CryptDB | Student 1 | Required: CryptDB: protecting confidentiality with encrypted query processing |
3 | Jan 23rd | Attack on CryptDB | Student 2 | Required: Inference Attacks on Property-Preserving Encrypted Databases |
3 | Jan 25th | Concurrent ORAM (1) - Proxy based | Student 3 | Required: TaoStore: Overcoming Asynchronicity in Oblivious Data Storage |
4 | Jan 30th | Concurrent ORAM (2) - Proxy-less | Student 4 | Required: ConcurORAM: High-Throughput Stateless Parallel Multi-Client ORAM |
4 | Feb 1st | Transactions in ORAM | Student 5 | Required: Obladi: Oblivious Serializable Transactions in the Cloud |
5 | Feb 6th | Replicated ORAM | Student 6 | Required: QuORAM: A Quorum-Replicated Fault Tolerant ORAM Datastore |
5 | Feb 8th | Alternate techniques to ORAM (1) | Student 7 | Required: Pancake: Frequency Smoothing for Encrypted Data Stores |
6 | Feb 13th | Alternate techniques to ORAM (2) | Student 8 | Required: Waffle: An Online Oblivious Datastore for Protecting Data Access Patterns |
6 | Feb 15th | Intro to MPC (both circuit and secret sharing based) and TEEs | Sujaya Maiyya | Slides |
7 | Feb 20th | No class: Reading week | | |
7 | Feb 22nd | No class: Reading week | | |
8 | Feb 27th | Circuit based federated DB | Student 9 | Required: SMCQL: Secure Querying for Federated Databases |
8 | Feb 29th | Circuit based encrypted DB | Student 10 | Required: Arx: An Encrypted Database using Semantically Secure Encryption |
9 | Mar 5th | Secret sharing based DB | Student 11 | Required: Information-Theoretically Secure and Highly Efficient Search and Row Retrieval |
9 | Mar 7th | TEE-based OLAP DB | Student 12 | Required: Opaque: An Oblivious and Encrypted Distributed Analytics Platform |
10 | Mar 12th | TEE-based oblivious DB | Student 13 | Required: ObliDB: Oblivious Query Processing for Secure Databases |
10 | Mar 14th | TEE-based oblivious scalable DB | Student 14 | Required: Snoopy: Surpassing the Scalability Bottleneck of Oblivious Storage |
11 | Mar 19th | Intro to private information retrieval | Sujaya Maiyya | Slides |
11 | Mar 21st | PIR based KV store | Student 15 | Required: Pantheon: Private Retrieval from Public Key-Value Store |
12 | Mar 26th | FSS-based PIR data system | Student 16 | Required: Splinter: Practical Private Queries on Public Data |
12 | Mar 28th | Demo or paper (based on class size) | Demo or Student 17 | |
13 | Apr 2nd | Project demos | | |
13 | Apr 4th | Project demos | | |
Mental Health: If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support.
On-campus Resources
Off-campus Resources
Diversity: It is our intent that students from all diverse backgrounds and perspectives be well served by this course, and that students’ learning needs be addressed both in and out of class. We recognize the immense value of the diversity in identities, perspectives, and contributions that students bring, and the benefit it has on our educational environment. Your suggestions are encouraged and appreciated. Please let us know ways to improve the effectiveness of the course for you personally or for other students or student groups. In particular:
MOSS (Measure of Software Similarity) is used in this course to compare students' assignments and ensure academic integrity. We will report suspicious activity, and penalties for plagiarism/cheating are severe. Please read the available information about academic integrity very carefully.
Discipline cases involving any automated marking system such as Marmoset include, but are not limited to, printing or returning values in order to match expected test results rather than making an actual reasonable attempt to solve the problem as required in the assignment question specification.