Hi! My name is Shufan Zhang. I am currently a third-year PhD student in Computer Science at the University of Waterloo (Waterloo, Canada), working on database security & privacy under the supervision of Prof. Xi He, as part of the Data System Group (DSG). I am looking for research internships in 2025/2026.
My research interests mainly lie in computer security & data privacy, on both the theory and system sides, as well as their intersections with database systems and machine learning. I am fortunate to work with many nice professors and talented colleagues on projects related to machine learning, access control, location privacy, and others.
Selected Projects
Privacy Provenance for Multi-Analyst Differential Privacy and beyond
Integrating differential privacy (DP) into scenarios involving multiple data analysts or federated data owners often results in significantly higher accuracy loss than necessary, thereby limiting DP's real-world applicability. To address this issue, we propose privacy provenance as a novel lens through which to examine techniques aimed at improving the practical utility of differential privacy.
- with Xi He. "Differential Privacy with Fine-Grained Provenance: Opportunities and Challenges", in IEEE Data Engineering Bulletin (DEBull). [paper]
- with Haochen Sun, Karl Knopf, Shubhankar Mohapatra, Wei Pang, Calvin Wang, Yingke Wang, Masoumeh Shafieinejad, David Emerson and Xi He. "FedDPSyn: Federated Tabular Data Synthesis with Computational Differential Privacy", in TPDP 2025. [paper]
- with Xi He. "DProvDB: Differentially Private Query Processing with Multi-Analyst Provenance", in Proc. ACM Manag. Data (SIGMOD 2024). [paper] [bibtex] [poster]
- with Le Yu, Yan Meng, Suguo Du, Yuling Chen, Yanli Ren and Haojin Zhu. "Privacy-preserving Location-based Advertising via Longitudinal Geo-indistinguishability", in IEEE Transactions on Mobile Computing, 2024. [link]
- A preliminary version of this work appeared in TPDP 2022:
with Runchao Jiang and Xi He. "DProvSQL: Privacy Provenance Framework for Differentially Private SQL Engine". [short paper] [bibtex] [talk video]
Holistic Security in Cloud Data Management
Cloud service providers, such as Google Cloud, AWS, and Cisco Panoptica, offer numerous security- and privacy-enhancing mechanisms, including DP, TEE, encrypted query processing techniques, and fine-grained access control policies. However, inappropriate use or mixing of these techniques can lead to unintended information leaks. Our research develops methods to systematically evaluate these combined security measures, helping cloud users make informed choices when representing their sensitive data to the cloud.
- with Xi He, Ashish Kundu, Sharad Mehrotra and Shantanu Sharma. "Secure Normal Form: Mediation Among Cross Cryptographic Leakages in Encrypted Databases", in ICDE 2024. [bibtex] [paper]
- with Primal Pappachan, Xi He and Sharad Mehrotra. "Preventing Inferences through Data Dependencies on Sensitive Data", in IEEE Trans. on Knowledge and Data Engineering (TKDE). [paper] [bibtex]
- Tattle-Tale [GitHub]
Theories towards Robust Representation of Word Embeddings
[Tweetorial]
Word embeddings (e.g., Word2Vec, BERT, GPT) can be conceptualized as high-dimensional vectors encoding semantic or contextual meaning through their distances to certain semantic anchors. We investigate the distance recovery problem, a theory towards preserving Lipschitz continuity when embedding words into a vector space, ensuring the embedding is robust against small perturbations in word inputs.
- with Zhuangfei Hu, Xinda Li, David P. Woodruff and Hongyang Zhang. "Recovery from Non-Decomposable Distance Oracles", in IEEE Trans. on Information Theory. [paper] [bibtex]
- A preliminary version of this work appeared in ITCS 2023:
with Zhuangfei Hu, Xinda Li, David P. Woodruff and Hongyang Zhang. "Recovery from Non-Decomposable Distance Oracles". [paper] [slides] [bibtex] [talk recording]
Teaching
Guest Lecturer:
- "Differentially Private Big Data Analytics and Machine Learning", CS 480/680, University of Waterloo (Spring 2023)
Teaching Assistant:
- CS 245 – Logic and Computation, University of Waterloo (Fall 2023).
- CS 480/680 – Introduction to Machine Learning, University of Waterloo (Spring 2023, Winter 2024).
- CS 458/658 – Computer Security and Privacy, University of Waterloo (Winter 2023, Winter-Spring-Fall 2022).
- CS 115 – Introduction to Computer Science, University of Waterloo (Fall 2021).
- CS 338 – Computer Applications in Business: Databases, University of Waterloo (Spring 2021).
Misc.
If you would like to get to know me, feel free to drop me an e-mail. Minds are like parachutes: they only function when open. We may have fun conversations and spark new ideas. Discussions and recommendations of interesting books, movies, and research are always welcome.
This page is still under construction.
Design Copyright © Shufan Zhang, 2019 - 2025
Last Modified: May 9, 2025