Zero-knowledge Deep Learning

What is this?

This project develops specialized zero-knowledge proof (ZKP) protocols for deep learning (DL). Rather than feeding neural networks into off-the-shelf, general-purpose ZKP backends, our approach:

  • Preserves tensor structures: Keeping tensor operations intact, instead of flattening them into generic circuits, allows proof generation to be parallelized, which is essential for compressing overhead (especially proving time) to feasible levels.
  • Develops specialized protocols: We fully exploit the mathematical properties of each tensor operation when designing its proof protocol, reducing the asymptotic overhead of the proofs.
  • Implements CUDA acceleration: We implement the protocols in CUDA to achieve a high degree of parallelization, enabled by the preserved tensor structures, reducing the empirical overhead of the proofs. (A minimal sketch of the underlying idea follows this list.)
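
To make the role of the preserved tensor structure concrete, here is a minimal sumcheck sketch over a prime field: the per-round folding of the evaluation table is a purely elementwise tensor operation, which is exactly the kind of step that parallelizes well on GPUs. This is an illustration only, not the actual zkLLM/zkDL protocol; the field modulus, the interactive (non-Fiat–Shamir) flow, and the helper names are all assumptions of this example.

```python
# A minimal sumcheck sketch over a prime field, treating a flattened tensor
# as the evaluation table of a multilinear polynomial on the Boolean
# hypercube. Illustrative only -- not the actual zkLLM/zkDL protocol.
import random

P = 2**61 - 1  # a Mersenne prime; a stand-in for the field used in practice

def fold(table, r):
    """Fix the first variable of the multilinear table to r. This is an
    elementwise (embarrassingly parallel) pass over the tensor -- the step
    where preserved tensor structure pays off."""
    half = len(table) // 2
    return [(table[i] + r * (table[half + i] - table[i])) % P for i in range(half)]

def sumcheck_prove_and_verify(table):
    """Interactively prove that sum(table) mod P equals the claimed value."""
    claim = sum(table) % P
    while len(table) > 1:
        half = len(table) // 2
        g0 = sum(table[:half]) % P        # g(0): partial sum with first var = 0
        g1 = sum(table[half:]) % P        # g(1): partial sum with first var = 1
        assert (g0 + g1) % P == claim     # verifier's per-round consistency check
        r = random.randrange(P)           # verifier's random challenge
        claim = (g0 + r * (g1 - g0)) % P  # g(r) becomes the next claim
        table = fold(table, r)            # prover folds the tensor
    # Final check: in a real protocol this evaluation would come from a
    # polynomial commitment rather than the plain table.
    assert table[0] == claim
    return True

# Example: an 8-element "tensor" (evaluations on the 2^3 hypercube)
assert sumcheck_prove_and_verify([random.randrange(P) for _ in range(8)])
```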

Achievements

We have successfully developed the first operational ZKP schemes for:

  • Large Language Models (LLMs) with up to 13 billion parameters: Our system generates proofs in just 15 minutes per inference. This is made possible by zkLLM, which introduces tlookup for the efficient handling of non-arithmetic tensor operations and zkAttn for the attention mechanism. zkLLM keeps the model parameters private while enabling efficient zero-knowledge verifiable computation over LLMs.
  • Training of neural networks with 10 million parameters: We achieve a proving time of just 1 minute per update. This is accomplished by zkDL, which initially focused on verifying the ReLU activation and its backpropagation (now superseded by zkLLM’s tlookup) and subsequently developed FAC4DNN for modeling neural networks as arithmetic circuits. (A sketch of the lookup idea follows this list.)
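
To give a flavor of how a lookup argument handles non-arithmetic tensor operations, the sketch below reduces ReLU to membership in a precomputed table over an assumed 8-bit fixed-point domain. A real argument such as tlookup proves this membership succinctly over committed tensors rather than checking each pair in the clear; the bit width, the pair encoding, and the function names here are illustrative assumptions, not the actual construction.

```python
# Sketch of reducing a non-arithmetic tensor op (ReLU) to table membership,
# in the spirit of (but far simpler than) zkLLM's tlookup. The 8-bit
# fixed-point domain and the pair encoding are assumptions of this example.

B = 8                                  # assumed fixed-point bit width
MASK = (1 << B) - 1                    # two's-complement mask for inputs

# Precompute all valid (input, ReLU(input)) pairs, packed into single
# integers so that membership becomes a one-dimensional lookup.
TABLE = {((x & MASK) << B) | max(x, 0)
         for x in range(-(1 << (B - 1)), 1 << (B - 1))}

def relu_claim_is_valid(inputs, outputs):
    """Check outputs == ReLU(inputs) purely via table membership. A real
    lookup argument (such as tlookup) certifies this succinctly instead of
    checking each pair in the clear."""
    return all((((x & MASK) << B) | y) in TABLE
               for x, y in zip(inputs, outputs))

assert relu_claim_is_valid([-3, 0, 5], [0, 0, 5])   # honest prover
assert not relu_claim_is_valid([-3], [7])           # cheating prover
```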

Future directions

  • Deep learning under fixed-point arithmetic: Cryptographic primitives operate over finite fields, so fixed-point arithmetic must replace the floating-point arithmetic that models are normally trained with. Specialized techniques for adapting models to fixed-point arithmetic can reduce overhead while preserving accuracy; this is closely related to, but distinct from, model quantization.
  • Implementation: Before any industrialization can take place, we believe a torch implementation over finite fields and elliptic curves is necessary. This would lay the groundwork for applying our ZKP system in real-world scenarios. (A sketch of fixed-point arithmetic over a finite field follows this list.)
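
As a hint of what such an implementation might look like, below is a minimal sketch of fixed-point tensor arithmetic over a prime field built on plain torch integer tensors. The modulus, scale, and helper names are assumptions for illustration; note that the truncation after multiplication is itself non-arithmetic and would need its own proof (e.g., a range or lookup argument), which is precisely why this direction is nontrivial.

```python
# A minimal sketch of fixed-point tensor arithmetic over a prime field using
# plain torch integer tensors. Modulus, scale, and helper names are assumed;
# a real system would also need field-friendly commitments and a proof for
# the truncation/rescaling step below.
import torch

P = 2**31 - 1      # Mersenne prime; products of two residues fit in int64
SCALE = 2**8       # assumed fixed-point scale: encoded value ~ round(x * SCALE)

def encode(x: torch.Tensor) -> torch.Tensor:
    """Map a float tensor to field elements; negatives become P - |v|."""
    v = torch.round(x * SCALE).to(torch.int64)
    return v % P                      # torch's % yields values in [0, P)

def decode(a: torch.Tensor) -> torch.Tensor:
    """Map field elements back to floats via centered representatives."""
    v = torch.where(a > P // 2, a - P, a)
    return v.to(torch.float64) / SCALE

def field_mul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Elementwise product in the field, then rescaling. In a ZKP, the
    truncation by SCALE is non-arithmetic and must itself be proven."""
    prod = (a * b) % P                # safe: operands < 2^31, product < 2^62
    v = torch.where(prod > P // 2, prod - P, prod)   # lift to signed range
    return torch.div(v, SCALE, rounding_mode='floor') % P

x = torch.tensor([1.5, -0.25, 3.0])
y = torch.tensor([2.0, 4.0, -1.0])
print(decode(field_mul(encode(x), encode(y))))   # ~ [3.0, -1.0, -3.0]
```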

But unfortunately, I am no longer working on this…

Yes, this project has been terminated after more than a year of struggle. The primary reason is that I cannot complete the future directions listed above on my own or with only a few collaborators (although they have been great). Additionally, despite my efforts to eliminate the asymptotic overhead, the fundamental empirical overhead introduced by the cryptographic structures remains a significant challenge compared with, for example, native float16 arithmetic. Overcoming it would require revolutionary advances in cryptography, and since my expertise lies in machine learning rather than cryptography, that breakthrough is likely not mine to make.

Haochen Sun
Computer Science PhD Student

My research focuses on enhancing the security and privacy of machine learning and data management.