Chengnian Sun and colleagues receive Most Influential Paper Award at SANER 2022

Professor Chengnian Sun and his colleagues Yuan Tian and David Lo have received a Most Influential Paper Award at SANER 2022, the 29th IEEE International Conference on Software Analysis, Evolution and Reengineering.

Their paper, titled “Information retrieval based nearest neighbor classification for fine-grained bug severity prediction,” was presented a decade ago at the 2012 Working Conference on Reverse Engineering. WCRE, as it is known, later merged with the Conference on Software Maintenance and Reengineering to become the SANER conference.

Chengnian Sun joined the Cheriton School of Computer Science in 2019 as an Assistant Professor. His research interests are in software engineering and programming languages. He focuses on techniques, tools and methodologies to improve software quality and developers’ productivity.

Before joining the Cheriton School of Computer Science, Professor Sun was a software engineer at Google in Mountain View, California, and before that a postdoctoral researcher in the Department of Computer Science at the University of California, Davis. He has a PhD in Computer Science from the National University of Singapore.

About this award-winning research

Software systems usually have bugs. Although some are critical and need to be repaired right away, others are minor and can be fixed when resources are available.

Some projects let users report defects through systems such as Bugzilla, where a reporter can not only describe a bug but also estimate its severity. Although guidelines exist for ranking bug severity, the process is manual and depends largely on the reporter’s expertise to assign the correct label. A novice may struggle to judge severity, and inaccurate labels make it harder for developers to prioritize which bugs to fix first.

Professor Sun and his colleagues developed an approach that infers severity labels from the information available in bug reports, using an information retrieval–based nearest neighbour solution to predict fine-grained severity labels. They tested their approach and compared it with state-of-the-art work using a collection of more than 65,000 reports from the bug repositories of three large open-source projects. Compared with other methods, their approach showed significant improvement, especially on hard-to-predict severity labels.
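
To give a rough sense of how a nearest neighbour approach of this kind can work, here is a minimal sketch in Python. It is not the authors' implementation: it assumes plain TF-IDF weighting with cosine similarity as the retrieval measure, and the names `predict_severity`, `history` and `k` are hypothetical. A new report is compared with previously labelled reports, and the most similar ones vote on its severity, with each vote weighted by similarity.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a report and split it into alphanumeric tokens."""
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


def build_idf(token_lists):
    """Smoothed inverse document frequency for each term in the history."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector; terms unseen in the history are ignored."""
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_severity(new_report, history, k=3):
    """Predict a severity label from the k most similar past reports.

    history is a list of (report_text, severity_label) pairs.
    Assumption: each neighbour's vote is weighted by its similarity.
    """
    if not history:
        raise ValueError("history must contain at least one labelled report")
    token_lists = [tokenize(text) for text, _ in history]
    idf = build_idf(token_lists)
    query = tfidf(tokenize(new_report), idf)
    scored = [(cosine(query, tfidf(tokens, idf)), label)
              for tokens, (_, label) in zip(token_lists, history)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = defaultdict(float)
    for similarity, label in scored[:k]:
        votes[label] += similarity
    return max(votes, key=votes.get)
```

Given a handful of labelled reports, calling `predict_severity("Application crashes with segmentation fault when opening file", history)` returns the label that the most similar past reports agree on. A production system would likely use a richer similarity measure and more of a report's fields than this sketch does.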