Master’s Thesis Presentation • Human–Computer Interaction • DSCode Comparator: An Interactive Interface for Comparing Models and Evaluating Code for Data Science Tasks

Thursday, March 19, 2026 1:00 pm - 2:00 pm EDT (GMT -04:00)

Please note: This master’s thesis presentation will take place in DC 3317.

Xinxin Yu, Master’s candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Anamaria Crisan

Code-generating models are increasingly used to support data science tasks. However, reviewing and validating their outputs remains largely manual and time-consuming, requiring users to understand how generated code works and to assess its quality and correctness. Rather than eliminating effort, these models often shift user work from writing code to verifying it. The challenge is compounded when different models produce diverse solutions of varying effectiveness, which makes systematic comparison and evaluation difficult.

To address these challenges, this thesis presents DSCode Comparator, an interactive system designed to support code understanding, evaluation, refinement, and comparison in data science workflows. The system enables users to examine code at multiple levels of granularity, ranging from individual lines of code to complete solutions across different prompts and tasks. DSCode Comparator incorporates an automated annotation pipeline that analyzes generated code and provides structured, line-level explanations to facilitate rapid comprehension. In addition, the system evaluates code quality along multiple functional and pragmatic dimensions, including efficiency, readability, usability, and resource usage.

Beyond individual code inspection, DSCode Comparator supports comparative analysis across models by aggregating annotations and evaluation results into compact summaries that highlight key differences in behavior and performance. Through a combination of empirical evaluation and user studies with data science practitioners, this thesis demonstrates that the proposed approach improves users’ ability to understand, compare, and refine code generated by large language models, reducing verification effort while supporting more informed decision-making in model-assisted programming.

Location Information

Location Address: DC - William G. Davis Computer Research Centre
200 University Avenue West
DC 3317
Waterloo, ON, CA N2L 3G1
