Please note: This seminar will take place in DC 1304 and virtually over Zoom.
Pengyu Nie, PhD candidate
Department of Electrical and Computer Engineering
University of Texas at Austin
Machine Learning (ML) techniques have been increasingly adopted for Software Engineering (SE) tasks, such as code completion and code summarization. However, existing ML models provide limited value for SE tasks, because these models do not take into account the key characteristics of software: software is executable and software constantly evolves. In this talk, I will present my insights and work on developing execution-guided and evolution-aware ML models for several SE tasks targeting important domains, including software testing, verification, and maintenance.
First, I will present my techniques to help developers write tests and formal proofs. My work has direct impact on software correctness and on everyone who depends on software. I will present TeCo, the first ML model for test completion/generation, and Roosterize, the first model for lemma name generation. To achieve good performance, these two tasks require reasoning about code execution, which existing ML models are not capable of. To tackle this problem, I designed and developed ML models that integrate execution data and use such data to validate generation results.
Next, I will present my techniques to help developers maintain software. Specifically, I will present my work on comment updating, i.e., automatically updating comments when the associated code changes. To solve this task, I proposed the first edit ML model for SE, which learns to perform developer-like edits instead of generating comments from scratch. This model can be generalized to general-purpose software editing, including tasks such as bug fixing and automated code review.
All my code and data are open source. My models are evaluated on real-world software and shown to outperform existing ML models by large margins. My contributions lay the foundation for the development of accurate, robust, and interpretable ML models for SE.
Bio: Pengyu Nie is a Ph.D. candidate at the University of Texas at Austin, advised by Milos Gligoric. Pengyu obtained his bachelor’s degree from the University of Science and Technology of China. His research area is the fusion of Software Engineering (SE) and Natural Language Processing (NLP), with a focus on improving developers’ productivity during software development, testing, and maintenance. He has published 14 papers in top-tier SE, NLP, and PL conferences. He is the recipient of an ACM SIGSOFT Distinguished Paper Award (FSE 2019) and the UT Austin Graduate School Continuing Fellowship.
More information can be found on his webpage: https://pengyunie.github.io.