PhD Seminar • Machine Learning | Reinforcement Learning • Learning Soft Constraints From Constrained Expert Demonstrations | Cheriton School of Computer Science

Please note: This PhD seminar will take place in DC 2585 and online.

Ashish Gaurav, PhD candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Pascal Poupart

Inverse reinforcement learning (IRL) is a growing subfield of reinforcement learning (RL). IRL aims to recover a reward function given access to an optimal policy (typically through demonstrations from an expert). This is the inverse of standard RL, which learns an optimal policy given a reward function. The typical assumption in IRL is that the expert data is generated by an agent optimizing only a reward function. However, in many settings, the agent may optimize a reward function subject to some constraints, where the constraints induce behaviors that may otherwise be difficult to express with a reward function alone.

Recovering both the reward and the constraint(s) is a difficult problem because of unidentifiability. We therefore consider the setting where the reward function is given and the constraint is unknown, and propose a method that is able to recover the constraint satisfactorily from the expert data. Previous work has also addressed constraint recovery in this setting, but it usually learns hard constraints. Our method, in contrast, can recover cumulative soft constraints that the agent satisfies on average per episode. In IRL fashion, our method solves this problem by adjusting the constraint function iteratively through a constrained optimization procedure, until the agent behavior matches the expert behavior. We demonstrate our approach on synthetic environments, robotics environments and real-world highway driving scenarios, and discuss the results and possible implications.
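
To make the distinction between hard and soft constraints concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the paper: the function names, the threshold `beta`, the discount `gamma`, and the toy per-step costs are all assumptions. A hard constraint is modelled here as "no step ever incurs a cost", while a soft constraint only requires the cumulative cost, averaged over episodes, to stay within `beta`.

```python
from statistics import mean

def episode_cost(step_costs, gamma=1.0):
    """Discounted cumulative constraint cost of a single episode."""
    return sum((gamma ** t) * c for t, c in enumerate(step_costs))

def satisfies_hard(episodes):
    """Hard constraint (as assumed here): no step in any episode has nonzero cost."""
    return all(c == 0 for ep in episodes for c in ep)

def satisfies_soft(episodes, beta, gamma=1.0):
    """Soft constraint: the average cumulative cost per episode is at most beta."""
    return mean(episode_cost(ep, gamma) for ep in episodes) <= beta

# Hypothetical per-step constraint costs for three episodes.
episodes = [[0, 1, 0], [0, 0, 0], [1, 1, 0]]

hard_ok = satisfies_hard(episodes)            # some steps incur cost
soft_ok = satisfies_soft(episodes, beta=1.0)  # average cumulative cost is (1 + 0 + 2) / 3
```

In this toy data the agent occasionally incurs cost, so it would violate a hard constraint, yet it still meets the soft constraint because the cost is bounded only on average per episode.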

Paper link: https://openreview.net/forum?id=8sSnD78NqTN

To attend this PhD seminar in person, please go to DC 2585. You can also attend using Zoom at https://vectorinstitute.zoom.us/j/83195064721.

Location Information

Location Address: DC - William G. Davis Computer Research Centre
200 University Avenue West
Waterloo, ON, CA N2L 3G1
Hybrid: DC 2585 | Online PhD seminar
