Please note: This master’s thesis presentation will take place online.
Anastasiia Avksientieva, Master’s candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Lukasz Golab
Machine learning models often exhibit not only explicit bias, i.e., unequal performance metrics across subgroups, but also implicit bias, where altering a model’s prediction is disproportionately difficult for some subgroups. In this work, we investigate two complementary approaches to overturning a model’s decision to achieve a desired label: modifying the features of a test input and unlearning a set of training samples. The novelty of our solution lies in combining these two methods with data summarization via informative rule mining that highlights biased subgroups.
We demonstrate the value of the resulting framework, REACT, by showing how users can detect a model’s implicit bias and compare the biases of different versions of a model. REACT is flexible, supporting practical constraints on feature-level interventions, for example limiting changes to modifiable attributes only.
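The abstract describes the approach only at a high level. As a minimal illustrative sketch (not REACT’s actual algorithm), the Python snippet below shows the first ingredient, a feature-level intervention search restricted to a user-specified set of modifiable attributes; the function name find_counterfactual, the candidate value grids, and the scikit-learn classifier are all assumptions for illustration.

```python
# Illustrative sketch only: a brute-force search for a feature-level
# intervention that flips a model's prediction, restricted to a set of
# "modifiable" attributes. All names here are hypothetical; this is not
# REACT's actual method.
import itertools
import numpy as np
from sklearn.linear_model import LogisticRegression

def find_counterfactual(model, x, desired_label, modifiable, candidates):
    """Try value combinations for the modifiable features, fewest changed
    features first, until the model outputs desired_label.

    model      -- fitted classifier with a .predict method
    x          -- 1-D feature vector for the test input
    modifiable -- indices of features the user is allowed to change
    candidates -- dict mapping feature index -> iterable of trial values
    """
    for k in range(1, len(modifiable) + 1):
        for subset in itertools.combinations(modifiable, k):
            grids = [candidates[i] for i in subset]
            for values in itertools.product(*grids):
                x_new = x.copy()
                x_new[list(subset)] = values
                if model.predict(x_new.reshape(1, -1))[0] == desired_label:
                    return x_new  # flip found with fewest features changed
    return None  # no allowed intervention overturns the decision

# Toy usage: train on synthetic data, then search for a label-flipping change.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = LogisticRegression().fit(X, y)

x = X[0]
target = 1 - clf.predict(x.reshape(1, -1))[0]
cf = find_counterfactual(clf, x, desired_label=target,
                         modifiable=[0, 1],  # only these features may change
                         candidates={0: np.linspace(-3, 3, 13),
                                     1: np.linspace(-3, 3, 13)})
print("counterfactual:", cf)
```

In a real system, the brute-force grid would be replaced by a proper counterfactual-explanation method, and unlearning a set of training samples would serve as the complementary intervention; how hard such flips are to find across subgroups is what the abstract calls implicit bias.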