Master’s Research Paper Presentation • Software Engineering • Revisiting the Impact of Crash Report Deduplication on Bug Fixing — A Case Study of Mozilla Firefox

Wednesday, July 31, 2024 2:00 pm - 3:00 pm EDT (GMT -04:00)

Please note: This presentation will take place in DC 2564 and online.

Wen Cui, Master’s candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Shane McIntosh

Software teams often collect field reports of software crashes to aid in their diagnosis, reproduction, and repair. These reports include key details, such as stack traces, platform details, and setting conditions. As software becomes more complex, the frequency of issues in the field also tends to increase, resulting in a large volume of crash reports. To streamline the handling of these reports and reduce redundancy, crash report deduplication approaches group similar crash reports together.

In this study, we set out to understand the impact of crash report deduplication methods on fixing bugs associated with Mozilla Firefox crash reports. By default, Mozilla deduplicates crash reports based on the method signature of the top frame in the stack trace. There are two main scenarios for how bugs are linked to these deduplicated crash groups. In the first, a single bug triggers crashes in one or more crash groups. In the second, a group of bugs collectively triggers crashes in a single crash group. We find that the bug-fixing time for a bug from the second scenario is slightly shorter than for a bug from the first scenario. This suggests that developers tend to address all bugs related to a crash group in one focused effort when those bugs are grouped together. If a new crash deduplication method could group crash reports such that the number of bugs belonging to the second scenario increases, it could reduce the overall bug-fixing time for developers.
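As a rough illustration of the top-frame deduplication described above, the following minimal sketch groups crash reports by the method signature of the first frame in their stack trace. The report layout, the field names (`id`, `stack_trace`), and the example signatures are hypothetical and chosen only for illustration; they are not Mozilla's actual data format or implementation.

```python
from collections import defaultdict


def top_frame_signature(stack_trace):
    """Return the signature of the top (first) frame, or None for an empty trace."""
    return stack_trace[0] if stack_trace else None


def group_by_top_frame(reports):
    """Group report ids by top-frame signature (hypothetical report layout)."""
    groups = defaultdict(list)
    for report in reports:
        signature = top_frame_signature(report["stack_trace"])
        groups[signature].append(report["id"])
    return dict(groups)


# Made-up reports: r1 and r3 share a top frame, so they land in the same group
# even though the rest of their stack traces differ.
reports = [
    {"id": "r1", "stack_trace": ["Widget::Paint", "Frame::Render", "main"]},
    {"id": "r2", "stack_trace": ["Parser::Next", "Loader::Run", "main"]},
    {"id": "r3", "stack_trace": ["Widget::Paint", "Timer::Fire", "main"]},
]

groups = group_by_top_frame(reports)
for signature, report_ids in groups.items():
    print(signature, report_ids)
```

Because only the top frame is compared, reports with different root causes can share a group, and one root cause can surface in several groups, which is how the two bug-to-group scenarios above arise.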

To this end, we apply the FAST tool, a state-of-the-art crash deduplication method, to Mozilla crash reports, and we find that the estimated bug-fixing time for the bugs linked to clusters produced by FAST shows a 0.2% reduction compared to that of Mozilla’s default deduplication method. Then, a time-series clustering of the clusters produced by FAST reveals two main trends in the growth of the number of crash reports per cluster over time: one of consistent, steady growth, and the other of a roughly cubic shape, with a period of rapid, exponentially increasing growth followed by a stabilizing plateau. This distinction may allow developers to prioritize crash clusters that exhibit early escalation for quicker resolution. To explore the feasibility of predicting these trends in advance, we study the number of occurrences that are needed to detect the trend type that a cluster will follow. We find that, on average, 30 occurrences are necessary to predict the trend with 80% accuracy, and accumulating these 30 occurrences takes, on average, 37 days. While this is too slow to be of practical use, this paper lays the foundation for future work to shorten the time required to predict the correct time-series trend for newly emerging crash clusters.
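To make the quantities in this paragraph concrete, the sketch below builds the cumulative growth curve of a single crash cluster and measures how many days it takes to accumulate a given number of occurrences. It is an illustrative sketch only, not the paper's clustering or prediction method; the function names and the example dates are hypothetical.

```python
from datetime import date


def cumulative_counts(crash_dates):
    """Return (days since first crash, cumulative report count) pairs."""
    ordered = sorted(crash_dates)
    curve = []
    for count, day in enumerate(ordered, start=1):
        offset = (day - ordered[0]).days
        if curve and curve[-1][0] == offset:
            # Same day as the previous report: update that day's running total.
            curve[-1] = (offset, count)
        else:
            curve.append((offset, count))
    return curve


def days_to_reach(crash_dates, n):
    """Days from the first report to the n-th report, or None if fewer than n."""
    ordered = sorted(crash_dates)
    if n < 1 or len(ordered) < n:
        return None
    return (ordered[n - 1] - ordered[0]).days


# Made-up report dates for one cluster.
dates = [date(2024, 1, 1), date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 10)]

curve = cumulative_counts(dates)
print(curve)
print(days_to_reach(dates, 3))
```

A curve like this, computed per cluster, is the kind of input a time-series clustering step would compare across clusters; calling `days_to_reach(dates, 30)` on real data corresponds to the waiting time discussed above.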


To attend this presentation in person, please go to DC 2564. You can also attend virtually using Zoom.