PhD Seminar • Machine Learning | Reinforcement Learning • Inverse Probabilistic Constraint Learning

Tuesday, December 3, 2024 11:00 am - 12:00 pm EST (GMT -05:00)

Please note: This PhD seminar will take place online.

Ashish Gaurav, PhD candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Pascal Poupart

Given demonstrations from a constrained expert, inverse constrained reinforcement learning recovers a reward and constraint(s) that can explain the expert’s behaviour. The recovered reward and constraint(s) can be interpreted as the motivation behind the expert’s actions in an underlying constrained Markov decision process setting, i.e., the reward is a long-term objective that the expert tries to optimize, while the constraint limits the space of optimal policies according to a feasibility criterion.

Following recent work, we assume a setting where the reward is known and a single constraint needs to be learned. For this setting, we provide the first framework to learn a probabilistic constraint from probabilistically constrained expert demonstrations. A probabilistic constraint directly bounds the probability of incurring costs above a certain threshold. We first show how existing constraint learning methods can be adapted to learn a probabilistic constraint. We then provide a principled method to learn a probabilistic constraint directly, and we validate the proposed algorithm with synthetic and real-world experiments.
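To make the notion of a probabilistic constraint concrete, here is a minimal, hypothetical sketch (not from the talk): a policy satisfies the constraint if the probability that a trajectory's cumulative cost exceeds a limit d stays below a tolerance delta, estimated by Monte Carlo over sampled rollouts. The cost model, `cost_limit`, and `delta` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def cumulative_costs(n_trajectories, horizon):
    # Stand-in for rolling out a policy in a constrained MDP:
    # each step incurs a random per-step cost in [0, 1].
    per_step = rng.uniform(0.0, 1.0, size=(n_trajectories, horizon))
    return per_step.sum(axis=1)

def satisfies_probabilistic_constraint(costs, cost_limit, delta):
    # A probabilistic constraint bounds the probability that the
    # cumulative cost C exceeds the limit d: P(C > d) <= delta.
    violation_prob = np.mean(costs > cost_limit)
    return violation_prob <= delta

costs = cumulative_costs(n_trajectories=10_000, horizon=20)
ok = satisfies_probabilistic_constraint(costs, cost_limit=12.0, delta=0.1)
print(ok)
```

This contrasts with an expected-cost constraint (bounding E[C] directly): a probabilistic constraint controls how often the cost exceeds the limit, not the average cost itself.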


Attend this PhD seminar on Zoom