Please note: This master’s thesis presentation will take place in DC 3317.
Xinxin Yu, Master’s candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Anamaria Crisan
Code-generating models are increasingly used to support data science tasks. However, reviewing and validating their outputs remains largely manual and time-consuming: users must understand how the generated code works and assess its quality and correctness. Rather than eliminating effort, these models often shift user work from writing code to verifying it. The challenge is compounded when different models produce different solutions of varying effectiveness, which makes systematic comparison and evaluation difficult.
To address these challenges, this thesis presents DSCode Comparator, an interactive system designed to support code understanding, evaluation, refinement, and comparison in data science workflows. The system enables users to examine code at multiple levels of granularity, ranging from individual lines of code to complete solutions across different prompts and tasks. DSCode Comparator incorporates an automated annotation pipeline that analyzes generated code and provides structured, line-level explanations to facilitate rapid comprehension. In addition, the system evaluates code quality along multiple functional and pragmatic dimensions, including efficiency, readability, usability, and resource usage.
Beyond individual code inspection, DSCode Comparator supports comparative analysis across models by aggregating annotations and evaluation results into compact summaries that highlight key differences in behavior and performance. Through a combination of empirical evaluation and user studies with data science practitioners, this thesis demonstrates that the proposed approach improves users’ ability to understand, compare, and refine code generated by large language models. It reduces verification effort while supporting more informed decision-making in model-assisted programming.