Speaker Archive: 2022

This is an archive of weekly meeting talks from 2022. See the speaker archives for the talks from the current year, and links to archives from other years.

This Year

Date Speaker Links from talk Summary
2022-01-13 Ian M  

A look at the year just past

  • 2021 was not a good year for privacy/security professionals
    • 30 000 website hacks every day
    • More DDOS
  • Ransomware attacks rampant
    • Largest ransom (that we know of) paid by Acer of $50M.
  • Frequency of successful attacks has lowered slightly in the last couple of years
  • People seem to be a common threat vector to exploit
    • Ransomware
    • Phishing
      • Covid (800M emails daily)
  • Business email compromise rising as a real phishing threat
  • Ransomware
    • 1/4 organizations are paying the ransom
    • Only 1% of victims did not get their data back after paying
      • Criminals are working hard to establish a reputation of being honest
  • Western Europe/North America are not the regions the most affected by mobile malware
    • Iran most affected (30% of phones)
  • Cybercrime: $10 trillion in losses per year globally
    • Phishing alone is $1.48 billion
  • The defence/protect market is now $40B/year in revenue
  • What’s holding back from dealing with threats?
    • Lack of budget
    • Lack of skilled experts
    • No 3rd party validation of how well the solutions are working
2022-01-13 Ted  

Quantum Computing and Shor’s Algorithm

  • Qubit: Quantum bit
    • Takes on a value between [0> and [1>
      • Surface of a sphere (not a line)
      • Using x for how close you are to 0/1 and theta to represent the angle
      • a[0> +b[1> : a,b in C, norm(a)^2 + norm(b)^2 = 1
  • When we have n qubits, we can write the system as a general superposition
    • More general than considering n independent qubits
    • The qubits are not independent from each other, we say the qubits are entangled
  • Measurement
    • Superpositions only last as long as we don’t observe them
    • As soon as they measured, it collapses
    • If we only measure some of the qubits, the remaining qubits are left in a state consistent with the others
  • Unitary transformation
    • Transformations are linear, invertible and are norm-preserving
      • Unitary matrices
    • Linearity: how a transformation works on any state by how it acts on the computational basis
  • Important gates:
    • Hadamard gate
    • Phase shift gate
      • Corresponds to a rotation of the qubit
    • Control gate
      • Only apply to the 1-bit, rotation
  • Quantum algorithms
    • Combination of gates and some input qubits, then measure the output
  • Factoring numbers
    • To factor a number N, it suffices to be able to find the order of an integer a (mod N).
      • Use Quantum Fourier transform
        • Turns [x> into a uniform superposition fo all with x encoded into the phase
      • Inverse QFT
        • Think of this as a phase estimation
  • Shor’s Algorithm
    • I am sorry, wasn’t able to take proper notes, was too much information too quickly.
2022-01-20 Ian G  

Fast fully oblivious compaction and shuffling

  • What are oblivious algorithms
    • An adversary looking at the execution cannot tell what the private data is
    • Operate on private data, on machine controlled by adversary (even OS)
  • SGX
    • External and protected memory
    • Protected memory not so protected
      • Someone with OS/BIOS can see what pages/cache lines you access
      • Memory is not so protected
  • Obliviousness
    • [External-memory, protected-memory, page, cache line, word]-oblivious + control flow oblivious
      • All together: fully oblivious
      • Full obliviousness is uncommon
    • Key tools: OSWAPs, FOAV
    • Can’t have conditional branches dependent on private data
  • Compaction: array of items
    • Some of the items are marked (red)
    • Put all the items marked on the left of the array, the rest (white) on the right
    • OS shouldn’t be able to say which items are marked and how many there are
  • Shuffling
    • OS should not know the original state
  • Oblivious shuffle + Non-oblivious Sort = Oblivious Sort
  • Random Labels + Oblivious Sort = Oblivious Shuffle
  • Fully oblivious compaction
    • ORCompact, covered by Sajin
    • 2-4x better and parallelisms much better
    • New: (1/2)nlog(n) vs Old: n(log(n)-2) OSWAPs
    • Note: not order-preserving
  • Fulling oblivious shuffling
    • State-of-the-art: Bitonic Shuffle (1968)
    • 2 new algorithms: Oblivious recursive shuffle (ORShuffle) and Streaming Bucket Oblivious Random Permutation (BORPStream)
    • ORShuffle always outperforms Bitonic Shuffle (~2x for small block sizes)
    • BORPStream outperforms both for large problem sizes (large N & large b)
    • BORPStream is also very useful for streaming scenario, where Phase 2 time is much faster than Bitonic and ORShuffle.
  • ORShuffle = Mark Half + ORCompact + Recurse Left and Right
    • (1/4)nlog(n)(log(n)+1) OSWAPs
    • Same as Bitonic Shuffles but don’t need to add random labels
    • So for small items, outperforms ~x2
  • BORPStream
    • Start with BORP
    • Try to make it fully oblivious in the obvious way with ORCompact
      • Terrible performance
    • Different approach: stream data items through the butterfly network as they arrive
    • Each buffer in the butterfly network is a Merge Split Node
      • Alternates between either output streams
      • Can use dummy packets
      • Perform OSWAPs to send packets the right way in oblivious way
    • When all items are in, flush buffers
    • Each item ends up in a random output bucket
    • Use ORCompact on each bucket to remove dummies
    • Use ORShuffle on the reals to randomize the buckets
    • Concatenate the buckets to get the output
    • Analytical performance
      • Phase 1: 2nsd
      • Phase 2: funny long math formula: [flush buffers]+[compact]+[shuffle]
  • Takeaways
    • Fully oblivious algorithms are vital to SGX
    • Need tools like FOAV
    • New oblivious compaction (ORCompact) and shuffling (ORShuffle and BORPStream) outperform prior work.
2022-01-27 Rasoul  

SMPC VS HE

  • Secure multi-party computation (SMPC), how it’s related to Homomorphic Encryption (HE) and a performance comparison
  • SMPC: broad term
    • Multiple parties each with private inputs a, b, c, d, e…. Want to compute f(a, b, c, d, e) such that they don’t learn more than the computation reveals
  • Applications
    • Yao’s Millionaire problem
      • Compare private numbers
    • Private Model Inference
      • Data (private)
      • Model (company IP)
    • Satellite Collision Detection
      • Trajectories (private)
  • SMPC
    • Correctness
    • Privacy
      • Fairness?
    • Assumptions
      • Collusion
      • Malicious Parties
    • Cryptographic Tools
      • Oblivious transfer
      • Garbled circuits
      • Zero-knowledge proofs
      • Secret sharing
      • Homomorphic Encryption
  • 2PC Example
    • All parties online
    • High Communication
      • O(n^2) communication for n parties
  • Homomorphic Encryption
    • Form of encryption
    • Perform computation data while encrypted
    • Evaluation functions using primitive operations (add, mult)
    • We focus on client-server protocols using HE.
  • Properties of HE
    • Cons
      • HE ciphertexts are huge (x6 up to x100)
      • Computation is slow
        • Compared to plaintext
      • No circuit privacy (by default)
    • Pros
      • No interaction required (low communication)
      • All computation on server
  • HE can be used for certain steps of MPC computations
  • SMPC: more communication, less computation, less collusion
  • HE: Less communication, more computation, more collusion
  • Takeaways
    • SMPC
      • Online players
      • Fairness
      • Non-collusion assumption
      • Harder security proofs
      • Equal contribution
    • HE
      • Client can go offline
      • No circuit privacy by default
      • No collusion assumption
      • Easier security proofs
      • Unequal contribution
2022-02-03 Hossam  

Blinded Computation

  • Scenario: outsourced computation
    • Goal: runt he server’s confidential code over client’s confidential data
    • Initial target: outsourced ML inference training
  • How can client avoid revealing data to the service provider
    • Fully-Homomorphic Encryption: slow
    • MPC: slow
  • Hardware-based isolation
  • Trusted Execution Environment
    • Side channel attacks are a challenge
      • Integrity guarantees
      • Limited confidentiality guarantees
    • Sealed-glass model of computation
      • Can look inside but not interfere
  • Solutions
    • Second TEE
      • Can protect system data
      • Can be fixed-function
      • Confidential processing on isolated processor
  • Cloud service, current TEEs: High performance, low confidentiality
  • FHE: High confidentiality, low performance
  • The approach
    • “Safe” streams of instructions don’t expose sensitive data
      • Allowed:
        • Computation on sensitive data
      • Prohibited
        • Writing to peripherals of sensitive data
        • Leaking of sensitive data
    • Method
      • Use taint-tracking to limit sensitive data to safe places
      • Use remote attestation to assure the client that taint-tracking is applied
    • Taint-tracking policy
      • Registers/memory have an associated “sensitive” bit
      • Ideal rule:
        • Sensitive (output) <- Sensitive (input) and depends (output on input)
  • System model
    • 1. Handshake (ind. remote attestation)
    • 2. Shared secret key
    • 3. Data import (inputs)
      • Sensitivity=1
      • Decrypt
    • 4. Safe (“blinded”) computation
      • Enforced by HW
    • 5. Data export (result)
      • Encrypt
      • sensitivity=0
  • Problem: So far, one sensitivity bit for many clients
    • Data can be sent to the wrong client
    • Need a different sensitivity domain for each client
  • Solution 1 Isolation by honest-but-curious server OS (e.g. ScL 4)
    • OS keep track of sensitivity domains
    • Only needs 1 sensitivity bit
  • Solution 2: Hardware support for multiple sensitivity domains
    • Hardware keeps track of sensitivity domain
    • Requires more memory & more complex
  • Emulation & RTL
    • Implemented in Spike and in Verilog (Ibex)
    • Tagged Memory
    • Approximation can be overridden for specific instructions
  • Taint-propagation rule must consider many different observable outputs
    • Not all these outputs can be marked as sensitive
  • Exceptions
    • Example: data-dependent exceptions
      • Division by zero
    • Instruction must not raise an exception on data-dependent conditions
    • Solution
      • Unconditional faults
      • Set a sensitive error flag
  • Flag-based error handling
    • Challenge: common error-handling patterns have some leakage
    • Solution: modify APIs to produce output and an error flag
  • Generating compliant code with LLVM
    • Creating compliant code by hand is error prone
    • Solution: taint-tracking compiler
  • Formal verification in F*
    • Goal: changes in sensitive state never affect non-sensitive state
2022-02-03 David  

SHeals and Heals: isogeny-based PKEs from a key validation method for SIDH

  • SIDH: post-quantum Diffie-Helmann
    • Exist problem where attacker can exploit when private key is re-used
      • Fixed by HealSIDH
  • SIDH
    • Start with elliptic curve
      • E/Fp, p=2^(e)3^(d)-1
      • Pa, Qa generate E[2^(e)]
      • Pb, Qb generate E[3^(f)]
    • Choose generators
    • Alice (A) and Bob (B) pick their isogenies
    • We get a commutative square along the maps
    • Shared secret is Eab = can be computed using Ea, Eb
    • Ea = E/(Pa + SaQa)
    • Eb = E/(Pb + SbQb)
  • GPST attack
    • Bob’s public key: Eb, Phy_b(Pa), Phy_b(Qa) + R2 where R2 in Eb(2) i.e. 2R2= infinity
    • Bob pretends R2 is part of his public key
      • R2 might interfere with Alice’s computation. (It might or might not disappear)
      • Can find out if Alice’s private key is odd or even
    • Can do attack more sophisticated to keep learning bits of Alice’s private key
    • Existing defences are very costly (x2 cost)
  • HealSIDH
    • Alter terms of the transaction
      • Give away different kind of information
    • E/Fp, p = 2^(2e)3^(2f) - 1
    • Give away additional public information
      • Ra, Sa, Rab, Sab
    • Bob’s new public key is Eb + some other stuff
    • Bob does not manipulate Eb (assumption), rest may be manipulated
  • Theorem: If Alice obtains the correct shared secret Lab, then Bob’s data was not manipulated
  • Bob can’t forge its value because he doesn’t know certain parts of Alice’s private key
2022-02-10 Hans  

Color My World feat. ARM Memory Tagging Extension

  • Tag memory to:
    • Track sensitive data (1)
    • Provide data-flow integrity (2)
    • Prevent memory errors (3)
  • (1) Prevent unsafe instructions from running
    • Also track taint in registers
  • (2) Check based on static information
    • Data-flow graph, type information
    • Writes/read instrumented to set/check tag
  • (3) Assign tags to pointers
    • Assign tags at run-time
    • Can use random tagging
  • ARM Memory Tagging Extension (MTE)
    • 4-bit tags
    • 16-byte tagging granularity
    • Stores address tag in pointer
    • Stores memory tag in tag memory
  • MemTag Sanitizer
    • Ostensibly the “intended use-case” of ARM MTE
      • Assign random tag on allocation
      • Mitigates both spacial and temporal memorial errors
      • Prevents some buffer overflows
      • Omits tagging of safe allocations based on static analysis
    • Weak adversary model
      • Relatively low entropy for statistical defence
      • No verification of address tags
  • Adversary Model
    • Strong adversary
      • Exploit a memory vulnerability
      • Read arbitrary memory
    • Implications
      • Tags known and no need to guess
        • Can repeat until tags match
      • Memory tags can be forged
        • PAuth can mitigate, but replay possible
      • What about MemTagSanitizer
        • Mostly statistical defence
  • Their Work
    • Deterministic tagging
      • Based on static analysis
      • Isolation of safe memory allocations
      • No reliance on confidentiality
    • Reserve default tag for safe allocations
      • Unsafe allocations tagged/untagged
    • Tag-forgery prevention
      • Explicitly (re)tag any unsafe pointers
  • Safety analysis
    • Find allocations that are
      • Memory safe
      • Functionally safe with tagging
      • Pointer safe
    • Implemented as two analysis passes
      • Local pre-function Function Pass
      • Global Module Pass
    • Extending Stack Safety Analysis
  • Tag forgery prevention
    • Explicitly unset safe tag when needed:
      • Loading from unsafe memory
      • Performing unsafe pointer arithmetic
    • Detecting unsafe memory at run-time?
      • Rely on tag forgery prevention
      • Only pointer to safe can have safe tag
  • Guarded allocations
    • Functionally safe or crashing allocations
      • Uses scalar-evolution
      • Analysis checks overflow properties
  • Pointer safe allocations
    • Most apply tag forgery prevention to pointers corrupted in safe allocations
  • Cost
    • Strong isolation of safe allocations
    • Performance overhead
    • Memory overhead
  • Future Work
    • Optimize instrumentation
    • Combine with different front-ends
    • Support language annotations
2022-02-10 Yousra  

Learning from APPs: Access control recommendation for framework APIs

  • Android access control model
    • Many Layers:
      • Bottom: Linux Kernel: Drivers
      • Working up: Framework Layer
      • Top: Application Layer:
        • Use APIs
          • Permission checks
        • JNI to talk to Linux Kernel
          • Access control
    • Is this access control enforcement effective?
      • No access control can lead to issues where apps can use components of others apps without limit
    • Android: Lack of an Oracle
      • Approximate Solution: Compare Access Control enforcement across multiple instances
    • Linux Kernel: Addicted’ S&P16
    • Framework Layer: AceDroid’ NDSS18
    • FReD Usenix’22
      • Compare AC proceeding an access to a file in the framework tot hat of the file DAC’s.
    • Inconsistencies between app and framework layers?
      • App less protected than framework
        • Hijacking vulnerability
      • Other way around
        • API is likely not correctly protected
  • Goal of this project:
    • Learn from APPs a better access control enforcement for framework APIs.
  • Can we learn from APPs?
    • Various AC checks
    • Investing AOSP
      • Assume it’s ground truth
      • Construct mappings between
        • APIs’ framework-level protection
        • APIs’ app-level protections
      • App-level protection could be of various types
        • Permissions
        • Exposure
        • UI interaction
      • Detect weakly protected APIs
      • As of now, the mappings are difficult to interpret
    • APPs protect APIs with
      • A protection equivalent or stronger than that of the framework
      • Or a UI block
  • Other challenges
    • Identifying the intended protection target is not straightforward
    • Solution
      • Majority voting
      • Name correlation via NLP analysis
2022-02-17 Asokan  

Contrast between research interest and industry interest

  • Five examples
    • Optimistic Fair exchange -> doctoral research
    • Generic Authentication Architecture
    • Channel Binding in Protocol Composition
    • Secure Device Pairing
    • On-board credentials
  • Fair Exchange
    • How can two mutually distrusting parties exchange digital “items” on the Internet?
    • Existing Solutions:
      • Trusted Third Party protocols
      • Gradual Exchange protocols
    • Design Choices
      • Common case: both want to complete the exchange
        • Efficient in common case
        • Allow recovery in case of exceptions
      • Requirements
        • Effectiveness
        • Fairness
        • Timeliness
        • (Non-invasive)
  • Optimistic fair exchange
    • Exchange permit before the exchange of items in case one of the parties does not do its part
    • Recovery protocol using the permits and the third party authority
  • Verifiable Encryption
    • Analogy: jewelry in a glass box: can see but can’t touch
    • Verifiable encryption of discrete logs
  • From verifiable encryption to permits
    • A-exp: desc of B_item
    • A-permit: verifiable encryption A-exp + A-item
  • Pre-paid coupons from the TTP to be used for every optimistic transaction
  • The aftermath
    • Someone has to run the third party
      • Monetize every transaction!
    • Two decades on, current status:
      • Reputation systems
      • In-line TTP (e.g., E-bay escrow service)
    • Continuing “impact” in research circles!
    • Impact in academy (success) vs real world impact (failure)
  • Lessons learned
    • Don’t just guess security requirements: Ask stakeholders
    • Desiderata for deployment and research can be different
      • “The more (independent) parties you require for your scheme, the less likely it will be deployed”
    • Capturing researcher interest does not always lead to a Tech Transfer Impact
      • MANETs anyone?
    • “90-10” rule applies to deploying security
      • “Good enough beats perfect”
2022-02-24 Urs  

Revisiting the Security of Biometric Authentication System Against Statistical Attacks

  • Goal
    • Statistical attacks exploit the overlaps in user’s behaviour across the population
    • They propose the first ever defence against diverse statistical attacks
  • Background
    • Authentication usually modelled as a binary classification or anomaly detection problem
    • Zero-effort attacker model
  • Statistical Attacks
    • Keystrokes behaviour analysis
      • Looked at the mean, standard deviation
      • Lego attack (CCS 2013/2016)
        • Used a lego robot to feed a “magic swipe” to a smartphone
  • Statistical attacks using Hypervolumes
    • Region of 3 or more dimensions in n-dimensional space
  • Attack Intuition
    • Rank overlapping regions based on amount of overlap, # of users per region, and # of samples in region
    • K-means clustering
  • Targeted biometrics
    • Touch input
    • Keystroke dynamics of physical keyboard
    • Mouse movement
    • Gait
    • Voice
  • Baselines Attacks
    • Master Key
    • Vanilla statistical attack
    • K-means++
    • Metric: Percentage of compromised population depending on attempt to attack
  • Results
    • Hypervolume better than the others
    • K-means++ second best
    • Voice is susceptible to statistical attacks
    • Touch and mouse do poorly against statistical attacks
  • Defence
    • Focus against ongoing attacks (at least two attempts)
    • Should distinguish between false rejects of legitimate user and statistical attack
    • Attacker has limited # of attempts
2022-02-24 Shannon  

Non-Malleable Secret Sharing

  • Secret Sharing
    • Security: Any authorized set of parties can reconstruct the secret
    • Privacy
  • Robustness
    • Adversary has access to a subset of the shares
    • E.g. uses a combinatorial structure: difference sets
  • Verifiability
    • Introduces verify algorithm which can be used to verify the validity of the shares
  • Non-malleable crypto
    • ~ early 2000’s
    • Impossible to generate a different cipher text such that the respective plaintext are related
  • Definition #1
    • Prevent any tampering of shares (verifiable by the shareholder)
    • Not a formal definition
    • Stronger notion than verifiability?
    • Share concatenated to the public key are encrypted
      • Provides the some level of confidentiality
  • Definition #2
    • Corrupt players cannot modify their shares without being caught during reconstruction
    • IPS-Malleability game
  • Definition #3
    • Secret sharing scheme recovers the original or something entirely unrelated
    • The adversary given t shares to modify wins if the reconstruction algorithm outputs a valid secret s’ not equal to s such that s’ ~ s
    • Independent tampering: t-non-communicating adversaries are given a single share each they can modify
2022-03-03 Jiayi  

Multi-user, multi-model Implicit Authentication for shared devices

  • Implicit Authentication (IA)
    • Authenticate a user transparently
    • Improve security and usability of user authentication
  • IA Schemes
    • Android Smart Lock
    • Behavioural Biometrics
      • Keystrokes dynamics
      • Etc …
  • General (ML-Based) IA workflow
    • Enrolment
    • Authentication
    • Update
  • Shared Mobile Devices (SMD)
    • Multi-user scenarios
      • Household sharing
    • Conflicts with the single-user design
    • Automatically detect user switch
  • Multi-user IA for SMD
    • Requires
      • Identify legitimate users
      • Reject attackers
  • Extension of IA mechanisms
    • Binary classification -> multi-class classification
  • Multi-model IA
    • Availability and accuracy problems of IA schemes
      • Relies on availability of its required device usage/behavioural pattern
  • Fusing multiple modalities
    • Fusion strategies
      • Feature level
      • Score level
      • Decision level
    • Take uncertainty into account
  • Dempster-Shafer Theory based function
    • Combine evidence from different sources and incorporate uncertainty into the score function
    • Apply Demster’s rule of combination to combine scores from different authenticators
  • Evaluation setup
    • HMOG dataset
      • Touch & gait
      • 3 legit users, 3 attackers
      • 3-min data clip for each user/attacker
      • 50 rounds
    • Methods for comparison
      • Single modality
      • Average
      • Weighted average: mean-EER, mean-AUC
      • Kalman filter
    • Metrics (Per-user)
      • False rejection rate (FRR)
      • False acceptance rate (FAR)
      • False identification rate (FIR)
      • Gini Coefficient (GC)
  • Results
    • DS methods have the best overall accuracy amongst the eight methods
    • Errors for DS methods concentrated on a small set of users
    • Accuracy improvement is bounded by the adopted modalities
2022-03-10 Bailey  

Towards meaningful consent for private computation

  • What is privacy?
    • Technical
    • Conceptual
    • Legal
    • Usable
  • Technical privacy risk?
    • Define, what is being protected, from who and under what conditions this protection will hold
  • Legal Privacy Risk
    • Who, can do what, under what conditions
    • Enforced by laws & sanctions
  • Data Subject Privacy Risk?
    • Financial
    • Professional
    • Societal
    • Safety
    • Right to privacy
  • Data Sharing Literature
    • Mobile permissions
    • Purpose/Type
    • Privacy Policies
    • Dark Patterns
    • Differential Privacy
  • Companies can now claim somewhat ownership of the data they collect on you and sell it to other companies (Google & MasterCard)
  • Building Private Computations
    • Trade-offs between privacy, accuracy and efficiency
  • Meaningful consent: three proposed perception vectors
    • Privacy Mechanisms
    • Sharing Structure
    • Perceived risk
  • Questions about sharing structures
    • How does the overall acceptability vary across different types of multiparty collaborations?
    • How does acceptability vary in multiparty data collaborations
  • Survey Overview
    • Participant set is N=916
    • Each participant received 1 out of 12 scenarios
    • 5-point semantic scale
    • Wording of statements have a strong influence on the user reactions (indefinitely vs in-use)
  • Lesson 1: Disambiguate third parties
  • Qualitative beginning: Interview Studies
    • Comprehension, consent and conditions
  • Lesson 2: Determine the influence of privacy guarantees versus risk
  • Lesson 3: Determine how to communicate to data subjects how their data is being protected
2022-03-10 Thomas  

Ants Forage Privately?

  • Differential Privacy
    • Exponential mechanism: depends on epsilon and the sensitivity
  • Ant Colony Optimization (ACO)
    • Graph problem originally
    • Meta-heuristic based on the foraging methods of ants
    • Walk around for food, when they find it, they bring some back and leave a pheromone trail whose strength depends on the amount and quality of the food they found.
  • Great drawings by Thomas
    • Ant fighting for privacy goes down the sadness trail
    • The better route will statistically be used more often
      • More pheromones
      • More usage
      • Positive feedback loop
  • Summary
    • Type of swarm intelligence
  • ACOR
    • Extension of ACO to the continuous search space
    • Solution is a vector in R^n
    • Pheromones are a table of N best solutions so far
    • Ranks are computed by evaluating the utility function F then sorting
  • Basic Algorithm
    • Random init of the pheromone store
    • In each generation, each of the M ants chooses a past solution based on the weights
    • New solutions are generated
    • New M best solutions are then sorted
    • Top N becomes the new pheromone store
  • Selection
    • Roulette wheel selection
  • Solution Creation
    • New solutions sampled one dimension at a time following the Gaussian distribution
  • Privacy
    • 2 steps that use the utility function (and thus the data)
      • Selection: thinks it is close enough to the exponential mechanism that we could get DP
      • Sorting and taking the top-k: could be made private or remove this step
2022-03-17 Meng  

Backward Symbolic Execution

  • Fast and Reliable Formal Verification of Smart Contract with the Move Prover
  • Forward Symbolic Prover
    • Path-based exploration
      • Transforms a path into a logic expression
        • Proves the logic expression
      • Path explosion
        • 2^l paths where l is the number of layers in the program
  • Backward symbolic execution
    • Why backwards?
      • Hide the path explosion problem behind the SMT solver with an explosion of variables
    • Convert program into a dynamic single assignment CDSAS
    • Do a topological sort
    • Walk-up process: recursive process
      • Walk backwards up the chain to get back B0 and then send B0 to the solver
  • Comparison
    • Forward: less free variables and much more human readable. However you need to prove it for all paths
  • Forward symbolic execution, again
    • Learned from backwards
      • Topological sorting is great
      • SMT solvers are capable of handling large sets of fresh variables
    • Can we use these to improve forward symbolic execution?
      • Walk-down process
  • For complex programs, SMT solver might not be able to process B0 -> go back to some Bk -> send it to the solver
    • Concretize a portion of the free variables
    • Repeat until B0 can be handled
  • This workflow applies also to Automated Exploit Generation (AEG)
  • Conclusion: SMT solvers are magic
2022-03-24 Parjanya  

Learning from APPs: Access control recommendation for framework APIs

  • Inconsistency Detection
    • What is lacking?
      • Comparing App level AC vs Framework level AC. Can we lean from the Apps?
      • Hypothesis: Yes!
  • Motivating Example - Samsung Note
    • New way to annoy you
    • DOS attack
    • State Leak
  • Indicators
    • Type of UI widget
    • Strings in the interactive widget
    • Strings in the contextual layout
    • Number of UI blocks
  • Manifest Protections
    • Apps declare components and how they can be accessed
      • Permission specifications
  • Overview
    • Statistically analyze system app to extract protections
      • Manifest level protection
      • UI based protection
    • Propose an oracle
  • Challenges
    • Manager Detection
      • Apps use a Manager class to interact with the framework API
        • Find the invokers
      • Invokers could also be on the server side
        • Need to filter them out
          • Exclude “service” dex files from the scope
    • Implicit interactions
      • Indirect Communications
        • Preprocess framework to add invocation edges
  • App-Analysis: Individual Components
    • Extracting entry points with app-side protections
    • A calligraphy for each individual app component
  • Finding the intended target: challenge
    • Protection applied at the onStart is intended for which API?
    • Same problem for Manager detection
    • Solution: Majority Voting
  • App-Analysis: Inter-component graph
    • Take AOSP as the ground truth
  • The TODOs
    • Implement majority voting
    • Analyze other AOSP system apps to uncover ground truths
2022-03-24 Owura  

Bad Characters: Imperceptible NLP attacks

  • NLP Models
    • Transformer-based encoder-decoder models: sequence-to-sequence
  • Adversarial NLP
    • Human Perceptible changes:
      • Input swaps with synonyms
      • Misspellings
      • Paraphrasing
      • Neighbours of input sequences in an embedded space
    • Human imperceptible changes?
      • Gap between visualization and NLP pipelines
        • Exploit glyphs
      • Invisible characters: characters with no visible glyph
        • U+200B -> zero width space
      • Homoglyphs: different characters that look similar
      • Reordering
      • Deletions
  • Attack Methodology
    • Black-box threat model
    • Integrity Attack
    • Availability Attack
  • Generating Adversarial examples
    • Differential genetic evolution
  • Evaluations
    • Machine Translation
      • Integrity & availability attacks
    • Toxic content detection
    • Textual entailment attacks
    • Named Entity Recognition Attack
    • Sentiment Analysis Attack
  • Search Engine Attack
    • Attack on Searching
    • Attack on Indexing
  • Defenses:
    • Optical Character Recognition
      • Render Input
      • Interpret input with OCR
      • Feed output to NLP model
2022-03-31 Adam  

Website fingerprinting in Satellite connections

  • Attacker Setup:
    • Looks at directions and size of packets
      • Not content
    • Statistical signatures to identify websites
  • Attack
    • Adversary can circumvent defences like Tor/OpenVPN/QPEP Network
    • Attacker can censor a website
  • Question
    • Are satellite connections harder to fingerprint than regular ones?
  • Motivation
    • Size, direction of packets cannot be hidden
    • Tor can be fingerprinted
    • Are similar methods functional for satellite connections?
  • Satellite connections
    • Used in remote regions
    • Performance enhancing proxies (PEP)
      • Tradeoff between performance and security
  • Data
    • Satellite simulated traffic
    • 100 websites, 125 captures for each site
    • Collected with QPE, Tor or OpenVPN on top of satellite traffic
  • Features
    • Unique packet length
    • Packet ordering
    • # of outgoing packets
    • Initial packet lengths
    • Sums up to ~4000 features
  • Results
    • Low FPR
    • Most difficult was Tor Satellite (~60% accuracy) (100 classes)
  • Next Steps
    • Feature importance analysis
    • Deep Learning
    • Apply k-FP method (use outputs of RF as inputs to KNN)
2022-03-31 Diaa  

Privacy-preserving RNNs

  • Privacy concern with user data when training ML models

  • Privacy-preserving Machine Learning (PPML)
    • Private
    • Distributed
    • Confidential
    • Scenarios
      • Clear training, private inference (#1)
      • Private training, private inference (#2)
  • Private Training approaches
    • Differential Privacy
      • Degraded utility
    • SMPC
      • Honest-majority assumptions
      • High communication overhead
  • Multiparty homomorphic encryption
    • Public collective key
    • Secret-shared secret-key
      • Allows “all-but-one” collusion
    • Enc2Share and Share2Enc functionality
    • Achieves confidentiality
  • RNN
    • Recurrent Neural-Network
      • Retains information through time
        • Takes time sequences
      • Problem: sequential repeated operation
        • Blows up approximations
  • Literature
    • Crypto-RNN
      • Scenario #1
      • New ways to compute activation functions (approximations, …)
      • Cipher text “refresh” frequency
    • Split Learning (Fed-SL)
      • Model cut in several consecutive segments
      • Scenario #2
      • Just intermediate computations are shared
  • Design Challenges
    • Non-linear activation function
      • Minimax poly approximation
    • Heavy homomorphic operations
      • New cipher text packing scheme
    • Exploding gradients
    • Large circuit depth
2022-04-07 Xiaohe  

Constantine: Automatic Side-Channel Resistance using efficient control and data flow linearization

  • Trying to fix side-channels automatically
    • Side-channel
      • Secret-dependent control-flow
      • Secret-dependent memory access
    • Fix: use constant-time programming
      • Difficult and error-prone
      • Automation needed
  • Security concerns on Prior Work
    • Prefetch? Does not work for active attackers
    • Execute both sides of every secret-dependent branches (real & decoy path)
      • Exposing decoy path side-channels
  • Complete Linearization? State Explosion
    • Over-approximation on points to analysis
    • Loop linearization
  • Threat Model
    • Strong adversary can run arbitrary code on the machine
  • IR normalization
    • Assume IR comes in states single assignment (SSA) form
    • Single entry point, single exit point
  • Code analysis
    • Find secret-dependent portions
      • Assume that the dev. Has testsuite/fuzzers/workloads as a profiling suite
      • Output: a set of secret-dependent branches and memory accesses
      • Also during the dynamic analysis: loop trip counts
    • Points-to analysis: SVF library
      • Output: metadata for each use of a pointer expression in a sensitive load/store
  • Control Flow Linearization (CFL)
    • Flow
      • Remove the condition
      • Replace each pointer expression computation
      • Wrap load/store with constant load/store
      • At the end replace the exit nodes
    • Loop Linearization
    • Radical Linearization to resolve security concern
  • How to use Constantine
    • Easy for a single input file
      • Use it like gcc
    • Harder for multiple input files
      • Not supported yet
  • Pros & Cons
    • Pros
      • Performance improvement
      • Security improvement
      • Uses LLVM passes for each step
    • Cons
      • X86 only dependencies
      • Difficult to add to existing software
      • Not a sound analysis, cannot guarantee it will linearize all the secret-dependent portions
2022-04-07 Lucas  

ML-based adversarial example generation for Image Classification Deep-Learning Models

2022-05-06 Douglas  

A brief intro to Tamarin

  • Tool for formal verification of security protocols in the symbolic model
    • Uses a set of state machine transitions
    • Tamarin explores the state space until it finds an execution that violates security
    • Especially effective on key exchange protocols
  • Used in creating TLS 1.3
    • Months of person effort
    • 1 week of compute time & 100GB of RAM
  • Key encapsulation mechanisms
    • Abstraction of Diffie-Helmann
  • Examples
    • Breaks down a basic KEH exchange in 3 parts (client vs server)
  • Tamarin rule
    • Rule name
    • Shorthand-declaration
    • Requirements
    • Produces
  • Lemmas are predicates over “action facts” that are recorded in the rules
  • Was able to find man-in-the-middle attack on live demonstrated protocol
2022-05-13 Chelsea  

Proving the security of Schnorr

  • Signature scheme
    • 3 part protocol
    • Secret key, public key
  • Random Oracle Model
    • Aka “programmable” random oracle model
  • Simulation of Schnorr
    • Want to show unforgreability of Schnorr Signatures
      • Show that an adversary that can break it can break the Discrete Log problem
        • Needs the forking lemma
      • Shows reduction to Discrete Log
  • Also been proved to be secure using the Algebraic Group Model (AGM)
    • Also reduces to Discrete Log
  • Reduction to one more discrete logarithm
  • Take aways
    • Schnorr signatures are simple, but proving their security has nuances
    • Other models allow for tighter reductions but require non-standard assumptions
2022-05-20 Vahid  

QMA-hardness of Consistency of Local Density Matrices (CLDM) with applications to Quantum Zero-Knowledge

  • QMA: Quantum equivalent of NP
  • Density Matrices
    • Distribution over different states
    • Characterizes mixed states
  • CLDM
    • Given m pairs of (Ci, Pi)
      • Ci is a subset of size at most k of [n]
    • Previously proven to be QMA-hard but in a more restrictive setting
    • To prove QMA-completeness
      • Prove it’s QMA
      • Prove it’s QMA-hard
    • Relate it to the Hamiltonian problem (known to be QMA-hard) to prove completeness
  • Zero-knowledge
    • Malicious Verifier cannot get any information out of the honest prover besides whether an object is in the language
    • Commitment scheme
      • Need:
        • Computational hiding
        • Statistical binding
  • Goal: Construct a ZK-proof for QMA:
    • Use CLDM as a QMA-complete to do this
    • Protocol reduces to verification of QMA
  • Conclusion
    • The simple ZK proof follows the 3-coloring NP proof principle
    • Defined CLDM and that is was QMA-hard
      • Used it to construct a (simpler) computation ZK-proof for QMA
2022-05-20 Adithya  

Recent work on ORAM

  • MPC with Random Access
  • Previous solution
    • Touch every single element in the memory for every access
      • O(n) memory complexity
    • Floram (CCS2017)
      • Read and write to private index i.
      • Lay data linearly
      • Use distributed point functions
      • Access complexity O(n)
        • MPC server
      • From user’s perspective O(sqrt(n))
  • Point function:
    • 0 everywhere except at one point
  • Floram
    • Store the database as XOR shares
    • Read, write, refresh using XOR shares and DPF
  • DuORAM
    • More expensive DPF generation and expansion to “pre-processing”
    • Avoid “refresh”
  • Pre-processing
    • Two main problems
      • What to write
      • Where to write
    • Can we postpone what to write?
      • Defoliated DPFs
    • Postponing of where to write
      • Use cyclic shifts
    • Avoiding the refresh
      • Can we re-use blinded shares?
        • They are “wrong” only in one location
  • Linear cost DPF generation
    • New protocol with O(long) communication and computation
      • Uses LowMC based block-cipher
      • Conditional swaps
  • Summary
    • Constant amortized online communication cost
    • Offline cost of O(long) is acquired in the MPC generation phase
    • Use of computational PIR can turn that into a 2-party protocol
2022-05-27 Asim  

GNN fingerprinting for ownership demonstration

  • GNN: takes graph, node features and graph structure
    • Input: node features and adjacency matrix
  • Ownership verification: Desiderata
    • Effective
    • Efficient
    • Non-invasive
    • Robust
  • Model ownership & Dataset Ownership
  • Work exists for watermarking GNNs
    • Invasive and not very robust
  • No prior work for fingerprinting GNN that satisfies desiderata
  • 2 types of attacks
    • 1 adv. has the adjacency matrix and directly trains a surrogate model
  • Intuition
    • Use node embeddings
  • Experiment 1
    • Goal: analyze how the embeddings are affected by model architecture and training data
    • Results:
      • No overlap between target and independent models
        • Independent training will generate different embeddings
          • Cannot use fingerprinting for dataset ownership
  • Experiment 2
    • Analyze whether embeddings can distinguish between a surrogate and an independently trained model
    • Results
      • Targets and surrogate models fully overlap
      • Can use embeddings to fingerprint model ownership
2022-05-27 Lindsey  

Obfuscated VN and UDP based PTs

  • LEAN Encryption Access Project
    • White label VPN
    • Recent increase of usage in LeapVPN (Russia-Ukraine War)
    • They are working on offs 4 over UDP and Snowflake
  • Fingerprintability of Open VPN
    • Op codes and ACK sequences are fingerprintable features
    • Commercial VPNs fail to implement obfuscation to hide these features
  • Mitigations
    • Short-term
      • No co-location
      • Random padding
      • Less predictable HS failure
    • Long-term
      • Standardized obfuscation solutions (PTs)
      • Transparency around methods used
    • UDP based pluggable transports (PTs)
      • Excitement around HTTP/3 protocols (QUICK/KCP/SCTP)
      • Paper: Web Censorship measurements of HTTP/3 over QUICK
  • Obfuscation
    • Obsf4
      • Requires TCP
2022-06-03 Setareh  

Taking a look into Execute-Only Memory

  • Complex software is often not developed by a single company but built upon software provided by other companies
  • XOM (Execute only memory)
    • Only separates flash memory but everything else is still shared
    • Look at before and after state at CPU-level
      • Info leakage
  • Instruction Recovery
    • Process
      • Input state generation
        • Look at the register values
      • Instruction execution
      • Instruction solving
  • Can determine a significant amount of instructions
  • Limitations
    • Some instructions have the same input/output state
      • Cannot distinguish between them
  • Mitigation
    • Disabling debug interface
    • Use a TEE
2022-06-10 Simon  

IHOP: Iteration Heuristic for (Quadratic) Optimization Problems

  • Gonna present this talk at USENIX later this year
  • Overview: Searchable Encryption
    • Leakage of the access pattern
      • Enough for serve to recover some contents from certain files
    • Statistical attack
  • Leakage & Adversary model
    • Auxiliary Information
      • Adv. builds volume co-occurrence matrix
      • Adv. Goal: match keyword to tokens sent by Alice
  • Quadratic Assignment Problem (QAP)
    • Finding a permutation matrix
    • NP-hard
    • Simplification:
      • Linear Assignment Problem (LAP)
        • Lots of wasted of information, but it is efficient
  • IHOP
    • Solve LAP with a fixed/frozen row/column that are used as auxiliary information
      • “Cats are in boxes” -> extra information
      • Improves the estimations
  • Significantly outperforms previous work
  • Overview of Pancake
    • Smoothens access frequency
      • Makes it uniform
    • Batch each query with 2 other queries
  • What if there are query dependencies
    • Transitions are not uniform anymore
  • Pancake is still very effective but it can be attacked
    • Accuracy against pancake under different settings/datasets ranges from ~20 to 60%
  • Sneak Peek
    • Using ML for SubGrap Matching
2022-06-10 Diogo  

DeepSE-WF: Unified Security Estimation for Website Fingerprinting defences

  • VPNs protect user traffic from eavesdroppers
    • It still leaks traffic patterns
    • Different websites generate different traffic patterns
      • Adversary can create a database of website traces and match Alice’s traffic
  • Tor aims to make traffic analysis harder
    • Still leaks volume, direction and timing info about packet
    • Hence does not prevent website fingerprinting
      • Tor can be seen as a big VPN node
  • How can one defend against website fingerprinting?
    • Constant-rate padding
    • Supersequence
    • Adaptive padding
    • Application-layer
    • Traffic splitting
    • Trace noise/merge
    • Adversarial defences
  • How to evaluate them?
    • On close-world setting
      • Hardest for the adversary
    • Security evaluation as an iterative cat-and-mouse game
      • Highly dependent on new attacks and their ML classifiers
        • Fix: Classifier-Independent defence evaluation
          • Estimate smallest achievable error
  • Contributions
    • Identification of a disconnect between the features used in current WF attacks and those used by existing WF security estimation approaches
    • …
  • DeepSE-WF security estimation pipeline
    • Generate defended traces
    • Learn latent features
    • Estimate BER/MI
      • k-NN based
      • BER: Bayes Error Rate
      • MI: Mutual Information
  • Results
    • DeepSE-WF produces tighter BER estimates
      • And scales better for larger number of samples
    • Also produces more convincing MI estimates
    • Overestimates BER
    • Tighter bounds more efficiently
  • Conclusion
    • Current security estimation frameworks over-estimate
    • Proposed DeepSE-WF
2022-06-24 Vasisht  

Towards Effective Measurement of Membership Privacy Risk for Machine Learning Models

  • ML models are currently widely adopted
  • Occasionally trained on sensitive data
  • Membership Inference Attack
    • Infer whether a data record sampled from data distribution was in the training data
    • Cause: ML model memorizes some training data records making them vulnerable to inference attacks
    • Model is more confident on seen training data than unseen testing data
  • Why do we care about membership privacy risk?
    • Regulatory requirements
    • Membership inference attacks (MIAs) leak sensitive training data
  • Desiderata for measuring risk:
    • Fine-grained: privacy of individual training records
    • Attack agnostic: independent of specific MIAs
    • Effective
    • Applicable
  • No previous work satisfies all of the desiderata (some come close)
  • Indistinguishability Game
    • Distinguish between 2 settings (D vs D+x)
    • Adversary queries the model and tries to identify the setting
    • Inefficient
  • Shapley Values: alternative to the leave-one-out approach
    • Game-theoretic approach to equitably assign utility among different players
    • Can they be an effective and efficient membership privacy risk measurement value
  • Shapley Values Properties
    • Semantic Meaning
    • Additive
    • Heterogeneous
    • R1: fine-grained
    • R2: attack-agnostic
    • But not efficient
  • SHAPr: Computing Shapley values via KNN
    • R3: evaluating effectiveness of SHAPr
  • Privacy risk over sensitive subgroups
  • SHAPr inherits applicability to data valuation.
  • Defences: effect of noise addition
    • Add noise to training data records to lower their privacy risk
    • Not a very effective defence
  • Defences: Data removal of high risk records
    • Does not improve privacy
  • Defences: evaluating L2 Regularization
    • Kind of works
  • R5: SHAPr’s efficiency
    • Executing from 2 to 90 min (one-time cost)
      • 100x faster than naive leave-one-out approach
  • Summary
    • SHAPr is a tool for model builders to asses the membership privacy risk
2022-07-08 Jason  

Verifpal: Formal Verification Made Easy

  • Formal Verification
    • Check security of protocols
    • It’s hard
      • Need to be very careful
    • Long run times/may never terminate
  • Verifpal
    • Intuitive language
    • Helps avoid modeling errors
    • Easy to understand overall
    • Simple language syntax
  • Used by many already
    • For example Zoom
  • Demo time!
    • Simple public key cryptography example
  • Some problems
    • Cannot define new primitives (unless you submit a pull request to their GitHub codebase)
2022-07-08 Faezeh  

Privacy for Free: How does Dataset Condensation Help Privacy?

  • Problem
    • Privacy attacks on ML models
    • Issues with DP-generators?
      • Low utility induced by DP lowers model accuracy unless more data is available
  • Solution
    • Dataset Condensation
    • Paper introduces DC as a replacement for the DP-generator
  • Theoretical Analysis Results
    • Prove that linear DC extractors follow the concept of DP that one element does not greatly change the model parameter distribution
  • Empirical Results
    • DC-synthesized data are not perceptually similar to the original data and cannot be reversed using similarity metrics
  • Discussion
    • Where is the trade-off? It sounds like free lunch.
2022-07-15 Emily  

Differential Privacy for Nearest Neighbor Queries

  • Motivation
    • Finding stores/other locations nearby in a differentially private manner.
    • Query Protection
    • Database Protection
    • New method for executing DP nearest neighbour searches using trees
    • Looking for Geo-Indistinguishability
  • K-D trees
    • Tree where each node stores two coordinates
    • Alternating between each coordinate when following the nodes down the tree
    • Allows to quickly find close points in 2d space
    • K-D trees are efficient if there are many more points that there are dimensions
  • Honest but curious server
  • Privacy for the query not the tree
  • Looked at variations of the basic traversals
    • Parallel traversal
    • Early Stopping
    • Greedy spliiting
    • Greedy splitting with neighbourhoods
  • Distance metric is not better than geo-indistinguishability
  • Comparison metric offers a method independent from any distance restriction
  • Different traversal algorithms provide methods for better accuracy
2022-07-15 Justin  

Signal’s Sealed Sender Service, Summarized:

Signatures Sent Secretly Simply Secure State

  • Signal’s metadata protection and the guarantees it provides.
  • Metadata: data about data
  • Example: WhatsApp: only encrypts the messages but not the metadata
    • And it shares it to governments
  • Protecting metadata:
    • Sealed Sender
    • Keep at little state as possible on their server
    • Messages are end-to-end encrypted and metadata is usually not recorded
  • Signal is working towards hiding another piece of metadata: who is messaging whom
  • Also trying to provide provable guarantees around protecting metadata.
  • Sealed Sender
    • Encrypt the sender as part of the message content
  • Signal checkmarks
    • One when message is received by Signal
    • Another one when the message is received by the intended recipient
  • Statistical disclosure attack risk.
  • Can create batches of people that receive message soon after Bob sent a message
    • In the aggregate with random samples allows to distinguish who is talking to Bob.
2022-08-12 Navid  

All-or-Nothing Transforms and their applications in Flexible and Efficient Secret Sharing

  • Types of Security
    • Computational
    • Semantic
    • Information Theoretic
      • Very very secure
      • Is it necessary?
        • Yes: quantum computers
  • All-or-Nothing Transforms
    • Need all n compute blocks to learn anything about any single block
    • Extended to t-AONT
      • Same thing but only need n-t blocks instead of n and they learn about n-t blocks
  • Example on protecting pairs out of 3 combinations of 3 characters.
  • Different kinds of AONTs
    • Restricted
    • Asymmetric
  • Cauchy Matrix
  • More recent results
    • If all results are not equiprobable, we may have some information leakage
    • As long as all inputs are possible, all outputs are possible
  • Post-quantum Hash-based Group Signature
    • Group members to sign messages on behalf of a group
    • Players:
      • Key issuing authority
      • Verifier
      • Openers
      • Members
    • Openers have keys with a single colour.
    • Divide keys such that every member has no two keys that have the same colours.
    • Problem:
      • For each signature, there are two openers who can trace the signer
        • 2 is small
      • We can use AONT on the opener keys
  • Other Applications
    • Secure Distributed Storage
      • Prevent storage groups from knowing what you stored on their servers.
  • Other projects:
    • a study on remote Storage Usage
    • Locally Reconstruction Codes
      • Recover lost piece of information from nearby servers
    • Restricted Overlapping Codes
    • Search for more instances of (28, 10, 5)-BIBDs
      • You have 28 points along 10 sets of lines and each point appears on 5 lines and any two point only share one line
2022-09-16 Florian Private_Collaborative_Data_Cleaning.pdf

Private Collaborative Data Cleaning

We introduce and investigate the privacy-preserving version of collaborative data cleaning. With collaborative data cleaning, two parties want to reconcile their data sets to filter out badly classified, misclassified data items. In the privacy-preserving (private) version of data cleaning, the additional security goal is that parties should only learn their misclassified data items, but nothing else about the other party's data set. The problem of private data cleaning is essentially a variation of private set intersection (PSI), and one could employ recent circuit-PSI techniques to compute misclassifications with privacy. However, we design, analyze, and implement three new protocols tailored to the specifics of private data cleaning that significantly outperform a circuit-PSI-based approach. With the first protocol, we exploit the idea that a small additional leakage (the size of the intersection of data items) allows for runtime and communication improvements of more than one order of magnitude over circuit-PSI. The other two protocols convert the problem of finding a mismatch in data classifications into finding a match, and then follow the standard technique of using oblivious pseudo-random functions (OPRF) for computing PSI. Depending on the number of data classes, this leads to either total runtime or communication improvements of up to two orders of magnitude over circuit-PSI.

2022-09-16 David   I talked about breaking SIKE, based on "An efficient key recovery attack on SIDH" https://eprint.iacr.org/2022/975 by Castryck and Decru and subsequent improvements by Wesolowski.
2022-09-23 Miti Miti-CrySP_meeting_talk.pdf I (Miti) talked about data minimization for distributed microservices architectures in this CrySP talk. I focused on two privacy violations: customers' data being used for secondary purposes-of-use and being shared extensively with third parties. I proposed two frameworks: a Privacy Observability Framework (POF) and a Privacy Enforcement Framework (PEF). The POF detects privacy violations in the testing stage, which are to be fixed by developers, whereas the PEF prevents privacy-violating microservices from getting users' data in the production stage. I discussed the design of these frameworks in terms of a data and a control plane. A key design takeaway is to use existing tooling, such as distributed tracing, to build privacy engineering frameworks.
2022-09-30 Nils   We derive metrics for empirically measuring the leakage of personally identifiable information (PII) in autoregressive language models such as GPT-2. Our findings demonstrate that differential privacy still leaks PII in practice and needs to be supplemented with data curation methods such as scrubbing. We evaluate defences from related work in their ability to protect PII and propose a heuristic scrubbing method that only scrubs token susceptible to leakage. Our defence improves the utility-privacy trade-off of existing defences.
2022-09-30 Ru   The effectiveness of formal verification depends on the quality of the specifications. Even the fully verified program can only be as correct as the specifications. In this work, we focus on 1) producing a mutant that is more likely to pass the verification; 2) categorize the surviving mutant — whether it’s intentional or by mistake.
2022-10-14 Ian

Slides: Ian-otspir.pdf

Rust implementation

Menon and Wu Oakland 2022 PIR paper

Menon and Wu PIR code

Bellare and Micali Crypto 1989 OT paper

Naor and Pinkas STOC 1999 PIR-to-SPIR transform

Oblivious Transfer and Symmetric Private Information Retrieval

This talk describes different ways for a receiver to retrieve information in a database held by a sender, without the sender learning what information was being requested. Insisting that the receiver learn only a single database element yields oblivious transfer (if communication the size of the whole database is allowed) or symmetric private information retrieval (if it is not). The talk shows a simple construction of 1-of-2 oblivious transfer, and how to use it to achieve symmetric private information retrieval. A Rust implementation of these protocols is available.

2022-10-14 Bailey   Private computation, which includes techniques like multi-party computation and private query execution, holds great promise for enabling organizations to analyze data they and their partners hold while maintaining data subjects' privacy. Despite recent interest in communicating about differential privacy, end users' perspectives on private computation have not previously been studied. To fill this gap, we conducted 22 semi-structured interviews investigating users' understanding of, and expectations for, private computation over data about them. Interviews centered on four concrete data-analysis scenarios (e.g., ad conversion analysis), each with a variant that did not use private computation and one that did (private set intersection, multiparty computation, and privacy preserving query procedures). While participants struggled with abstract definitions of private computation, they found the concrete scenarios enlightening and plausible even though we did not explain the complex cryptographic underpinnings. Private computation increased participants' acceptance of data sharing, but not unconditionally; the purpose of data sharing and analysis was the primary driver of their attitudes. Through co-design activities, participants emphasized the importance of detailing the purpose of a computation and clarifying that inputs to private computation are not shared across organizations when describing private computation to end users.
2022-10-21 Thomas https://arxiv.org/abs/2107.12407

Selective MPC: Distributed Computation of Differentially Private Key-Value Statistics

Key-value data is a naturally occurring data type that has not been thoroughly investigated in the local trust model. Existing local differentially private (LDP) solutions for computing statistics over key-value data suffer from the inherent accuracy limitations of each user adding their own noise. Multi-party computation (MPC) maintains better accuracy than LDP and similarly does not require a trusted central party. However, naively applying MPC to key-value data results in prohibitively expensive computation costs. In this work, we present selective multi-party computation, a novel approach to distributed computation that leverages DP leakage to efficiently and accurately compute statistics over key-value data. By providing each party with a view of a random subset of the data, we can capture subtractive noise. We prove that our protocol satisfies pure DP and is provably secure in the combined DP/MPC model. Our empirical evaluation demonstrates that we can compute statistics over 10,000 keys in 20 seconds and can scale up to 30 servers while obtaining results for a single key in under a second.

2022-10-21 Rasoul https://arxiv.org/abs/2209.13913 Faster Secure Comparisons with Offline Phase for Efficient Private Set Intersection

I presented our paper that recently got accepted to NDSS 2023. In this paper, we propose PSI with an online and offline phase, such that the online phase is highly efficient. In this online phase, elements are compared with a very fast comparison operator which requires only 4 operations in the field. The evaluation shows that our online phase is faster than all existing approaches with an online phase.

The online phase makes use of OLE tuples, which are generated in the offline phase. We propose 3 approaches to generate OLE tuples: A trusted third party, trusted execution environments, and cryptographic approaches (OT and HE). We show the tradeoff that each approach in the offline phase offers and evaluate them in terms of runtime and communication cost, in various network settings.
2022-10-28 Seb https://arxiv.org/abs/2207.01991 Nowadays, systems based on machine learning (ML) are widely used in different domains. Given their popularity, ML models have become targets for various attacks. As a result, research at the intersection of security/privacy and ML has flourished. Typically such work has focused on individual types of security/privacy concerns and mitigations thereof. However, in real-life deployments, an ML model will need to be protected against several concerns simultaneously. A protection mechanism optimal for one security or privacy concern may interact negatively with mechanisms intended to address other concerns. Despite its practical relevance, the potential for such conflicts has not been studied adequately. We first provide a framework for analyzing such "conflicting interactions". We then focus on systematically analyzing pairwise interactions between protection mechanisms for one concern, model and data ownership verification, with two other classes of ML protection mechanisms: differentially private training, and robustness against model evasion. We find that several pairwise interactions result in conflicts. We explore potential approaches for avoiding such conflicts. First, we study the effect of hyperparameter relaxations, finding that there is no sweet spot balancing the performance of both protection mechanisms. Second, we explore if modifying one type of protection mechanism (ownership verification) so as to decouple it from factors that may be impacted by a conflicting mechanism (differentially private training or robustness to model evasion) can avoid conflict. We show that this approach can avoid the conflict between ownership verification mechanisms when combined with differentially private training, but has no effect on robustness to model evasion. Finally, we identify the gaps in the landscape of studying interactions between other types of ML protection mechanisms.
2022-11-04 Lachlan Slides

Cryptographically authenticated in-memory data structures, or: How to build secure software without secure memory

Memory safety vulnerabilities continue to be a major issue for software today, and can give attackers read-write access to any data that a program stores in memory. In this talk I discuss how to cryptographically protect the data in memory and, more generally, the problem developing secure software when the contents of memory cannot be trusted.

2022-11-04 Dheeraj   In my talk I presented the paper Uncovering Cross-Context Inconsistent Access Control Enforcement in Android by zhou et al. In this paper, the authors address the problem of cross-context access control consistencies. Previous work has shown that access control inconsistencies in System Services, which provide core android functionality, can lead to security breaches. However, such System Services are implemented across both the Java and the native layer of the android. Existing research has focused only on detecting inconsistent access control in the Java context of system services. In their work, the authors explain that if a resource can be accessed through both the Java layer and native layer, the access checks to reach that resource must be consistent across both the contexts, or else, an attacker can take the less restrictive path to access that resource. The authors propose the tool IAceFinder which is able to discover such access control inconsistencies across Java and Native services in Android ROMs using static analysis. To evaluate their tool, the authors analyse 14 Android ROMS, 2 official AOSP Roms and 12 open-source customizations. In total they find 23 inconsistencies, 19 type 1 and 4 type-2, which were reported and acknowledged by Google.
2022-11-18 Hossam

https://dl.acm.org/doi/10.1145/3470496.3527393

Slides

I presented the CraterLake paper, mentioned in one of the CCS keynote speeches. The authors present a new state-of-the-art accelerator, CraterLake, for accelerating fully homomorphic encryption operations. One crucial contribution is enabling CraterLake to perform bootstrapping efficiently, which was not done in prior work. They manage to achieve significant performance speedups compared to both CPUs and a prior state-of-the-art FHE accelerator (F1). The resulting speedup enables performing FHE-based ResNet-20 single image inference in approx. 250ms.
2022-11-25 Ian M    
2022-11-25 Yousra    
2022-12-02 Urs Slides Some Random Advice on using Machine Learning for Security or Privacy
2022-12-09 Andre    
2022-12-09 Georgios    
Topic revision: r1 - 2023-01-24 - HossamElAtali
 
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2023 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback