Professor Chengnian Sun and his colleagues Yuan Tian and David Lo have received a Most Influential Paper Award at SANER 2022, the 29th IEEE International Conference on Software Analysis, Evolution and Reengineering.
Their paper, titled “Information retrieval based nearest neighbor classification for fine-grained bug severity prediction,” was presented a decade ago at the 2012 Working Conference on Reverse Engineering. WCRE, as it is known, later merged with the Conference on Software Maintenance and Reengineering to become the SANER conference.

Chengnian Sun joined the Cheriton School of Computer Science in 2019 as an Assistant Professor. His research interests are in software engineering and programming languages. He focuses on techniques, tools and methodologies to improve software quality and developers’ productivity. Before joining the Cheriton School of Computer Science, Professor Sun was a software engineer at Google in Mountain View, California, and before that a postdoctoral researcher in the Department of Computer Science at the University of California, Davis. He has a PhD in Computer Science from the National University of Singapore.

About this award-winning research
Software systems usually have bugs. Although some are critical and need to be repaired right away, others are minor and can be fixed when resources are available.
Some projects allow users to report defects through reporting systems such as Bugzilla, which lets users not only describe the bug but also estimate its severity. Although guidelines have been developed to rank bug severity, the process is manual and depends largely on the expertise of the bug reporter to assign the correct label. A novice might struggle to determine severity, and an inaccurate label makes it harder for developers to prioritize which bugs to fix first.
Professor Sun and his colleagues developed an approach to infer severity labels from the information available in bug reports, using an information retrieval–based nearest neighbour solution to predict fine-grained severity labels. They then tested their approach and compared it with state-of-the-art work using a collection of more than 65,000 reports from the bug repositories of three large open source projects. Compared with other methods, their approach was significantly more accurate, especially on hard-to-predict severity labels.
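
To give a feel for the general idea of nearest-neighbour severity prediction, here is a minimal sketch in Python. It is not the authors' implementation: plain TF-IDF cosine similarity is swapped in for illustration and is not the similarity measure used in the paper, and the function names, the tiny `HISTORY` data set and the choice of `k=3` are all assumptions made for this example. A new report is compared with previously labelled reports, and the most similar ones vote on its severity.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled history; real bug repositories hold tens of thousands of reports.
HISTORY = [
    ("browser crashes on startup with segmentation fault", "critical"),
    ("crash when opening large file causes data loss", "critical"),
    ("typo in preferences dialog label", "trivial"),
    ("misaligned icon in toolbar", "minor"),
    ("button label typo in settings dialog", "trivial"),
]


def tokenize(text):
    """Lowercase and split on whitespace (no stemming or stop-word removal)."""
    return text.lower().split()


def build_idf(token_lists):
    """Smoothed inverse document frequency over the labelled reports."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log((n + 1) / (count + 1)) + 1.0 for term, count in df.items()}


def vectorize(tokens, idf):
    """Sparse TF-IDF vector; terms never seen in the history are ignored."""
    tf = Counter(tokens)
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_severity(new_report, labelled_reports, k=3):
    """Return the severity label favoured by the k most similar labelled reports."""
    token_lists = [tokenize(text) for text, _ in labelled_reports]
    idf = build_idf(token_lists)
    query = vectorize(tokenize(new_report), idf)

    # Score every historical report against the new one, most similar first.
    scored = sorted(
        (
            (cosine(query, vectorize(tokens, idf)), label)
            for tokens, (_, label) in zip(token_lists, labelled_reports)
        ),
        key=lambda pair: pair[0],
        reverse=True,
    )

    # Each of the k nearest neighbours votes with a weight equal to its similarity.
    votes = defaultdict(float)
    for similarity, label in scored[:k]:
        votes[label] += similarity
    return max(votes, key=votes.get)


print(predict_severity("application crashes with segmentation fault on startup", HISTORY))
```

Weighting votes by similarity means that one very close match counts for more than several loosely related reports, which is one simple way to handle labels that have few examples in the history.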