Master’s Thesis Presentation • Cryptography, Security and Privacy (CrySP) • Security and Ownership Verification in Deep Reinforcement Learning

Thursday, July 7, 2022 — 1:30 PM to 2:30 PM EDT

Please note: This master’s thesis presentation will be given online.

Ching-wen Wang, Master’s candidate
David R. Cheriton School of Computer Science

Supervisor: Professor N. Asokan

Deep reinforcement learning (DRL) has seen many successes in complex tasks such as robot manipulation, autonomous driving, and competitive games. However, there are few studies on the security threats against DRL systems. In this thesis, we focus on two security concerns in DRL.

The first security concern is adversarial perturbation attacks against DRL agents. These attacks mislead DRL agents into taking sub-optimal actions by applying small, imperceptible perturbations to the agent's observations of the environment. Prior work shows that DRL agents are vulnerable to adversarial perturbation attacks. However, prior attacks are difficult to deploy in real-time settings. We show that universal adversarial perturbations (UAPs) are effective in reducing a DRL agent's performance in its task and are fast enough to be mounted in real time. We propose three variants of UAPs. We evaluate the effectiveness of UAPs against different DRL agents (DQN, A2C, and PPO) in three different Atari 2600 games (Pong, Freeway, and Breakout). We show that UAPs can degrade agent performance by 100%, in some cases even for a perturbation bound as small as l = 0.01. We also propose a technique for detecting adversarial perturbation attacks. An effective detection technique can be used in DRL tasks with potentially negative outcomes (such as the agent failing a task or accumulating negative rewards) by suspending the task before the negative result manifests due to adversarial perturbation attacks. In our experiments, this detection method works best for Pong, with perfect precision and recall against all adversarial perturbation attacks, but is less robust for Breakout and Freeway.
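As a rough illustration of what "universal" means here, the sketch below adds one fixed perturbation to every observation and clamps it to a bound. It is not the thesis's method for computing UAPs. The function name `apply_uap`, the flat-list observation format, the [0, 1] value range, and the choice of an element-wise (L-infinity style) bound are all assumptions made for the example.

```python
def apply_uap(observation, perturbation, epsilon=0.01, low=0.0, high=1.0):
    """Add one fixed (universal) perturbation to a single observation.

    Assumptions (not taken from the thesis): observations are flat lists
    of floats in [low, high], and the bound is applied element-wise.
    """
    if len(observation) != len(perturbation):
        raise ValueError("observation and perturbation must have the same length")
    perturbed = []
    for value, delta in zip(observation, perturbation):
        # Enforce the perturbation bound on each element.
        delta = max(-epsilon, min(epsilon, delta))
        # Keep the perturbed value inside the valid observation range.
        perturbed.append(max(low, min(high, value + delta)))
    return perturbed


# The same perturbation is reused for every observation, so nothing has
# to be recomputed per time step while the agent is running.
uap = [0.5, -0.005, 0.5]
frame = [0.0, 0.5, 1.0]
perturbed_frame = apply_uap(frame, uap)
```

Because the perturbation is computed once ahead of time, the per-step cost in this sketch is a single element-wise addition, which is the property that makes such attacks plausible in real time.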

The second security concern is theft and unauthorized distribution of DRL agents. As DRL agents gain success in complex tasks, there is growing interest in monetizing them. However, the possibility of theft could jeopardize the profitability of deploying these agents. Robust ownership verification techniques can deter malicious parties from stealing these agents, and in the event that theft cannot be prevented, they can be used to track down and prosecute perpetrators. There are two prior works on ownership verification of DRL agents using watermarks. However, both techniques require the verifier to deploy the suspected stolen agent in an environment where the verifier has complete control over the environment states. We propose a new fingerprinting technique in which the verifier compares the percentage of action agreement between the suspect agent and the owner's agent in environments where UAPs are applied. Our experimental results show a significant difference in the percentage of action agreement (up to 50% in some cases) between the case where the suspect agent is a copy of the owner's agent and the case where it is an independently trained agent.
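The action-agreement measure can be sketched as follows. This is a minimal illustration, not the verification procedure from the thesis. The function `action_agreement`, the policies-as-callables interface, and the toy integer observations are hypothetical, and the observations passed in are assumed to have already been perturbed.

```python
def action_agreement(owner_policy, suspect_policy, observations):
    """Percentage of observations on which both policies pick the same action.

    Assumption: each policy is a callable mapping an observation to a
    discrete action, and `observations` are already perturbed.
    """
    if not observations:
        raise ValueError("need at least one observation")
    matches = sum(
        1 for obs in observations if owner_policy(obs) == suspect_policy(obs)
    )
    return 100.0 * matches / len(observations)


# Toy stand-ins for real agents: a copy behaves identically to the owner,
# while an independently built policy agrees only part of the time.
owner = lambda obs: obs % 3
stolen_copy = owner
independent = lambda obs: obs % 2

perturbed_observations = list(range(6))
copy_agreement = action_agreement(owner, stolen_copy, perturbed_observations)
independent_agreement = action_agreement(owner, independent, perturbed_observations)
```

A verifier would then compare the two percentages. How large a gap counts as evidence of theft is a decision the sketch leaves open.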


To join this master’s thesis presentation on Zoom, please go to https://uwaterloo.zoom.us/j/95938111155.

Location
Online master’s thesis presentation
200 University Avenue West
Waterloo, ON N2L 3G1
Canada