An Industrial Case Study of Automatically Identifying Performance Regression Causes

Authors -

Thanh H. D. Nguyen; Meiyappan Nagappan; Ahmed E. Hassan; Mohamed Nasser and Parminder Flora

Venue -

In Proceedings of the Practice Track at the 11th ACM/IEEE Working Conference on Mining Software Repositories (MSR 2014), Hyderabad, India, May 31 - June 1, 2014

Related Tags -

Abstract -

Even the addition of a single extra field or control statement in the source code of a large-scale software system can lead to performance regressions. Such regressions can considerably degrade the user experience. Working closely with the members of a performance engineering team, we observe that they face a major challenge in identifying the cause of a performance regression given the large number of performance counters (e.g., memory and CPU usage) that must be analyzed. We propose the mining of a regression-causes repository (where the results of performance tests and causes of past regressions are stored) to assist the performance team in identifying the regression-cause of a newly identified regression. We evaluate our approach on an open-source system, and the commercial system for which the team is responsible. The results show that our approach can accurately (up to 80% accuracy) identify performance regression-causes using a reasonably small number of historical test runs (sometimes as few as four test runs per regression-cause).

Preprint -

PDF

BibTex -

@article{Nguyen2014,
 author = {Nguyen, Thanh H. D. and Nagappan, Meiyappan and Hassan, Ahmed E. and Nasser, Mohamed and Flora, Parminder},
 keyword = {Performance, Log File Analysis},
 title = {An Industrial Case Study of Automatically Identifying Performance Regression Causes},
 type = {conference},
 venue = {In Proceedings of the Practice Track at the 11th ACM/IEEE Working Conference on Mining Software Repositories (MSR 2014), Hyderabad, India, May 31 - June 1, 2014}
}