Please note: This master’s thesis presentation will take place online.
Mingyang Yin, Master’s candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Shane McIntosh
Continuous Integration (CI) provides a feedback loop for the change sets that developers produce. CI must process change sets quickly to give developers timely feedback and to let teams release software updates rapidly. Prior work has proposed several automated approaches to speed up CI builds. While these approaches have been broadly adopted, CI platforms are also flexible enough to let teams devise custom strategies that optimize or omit unnecessary or redundant tasks (i.e., developer-applied accelerations). Exploring developer-applied accelerations and identifying recurrent patterns within them may enable broader reuse and can inform recommendations to enhance software development efficiency.
In this thesis, we set out to detect and catalog developer-applied CI accelerations. First, we propose clustering, rule-based, and ensemble approaches to detect developer-applied accelerations in a dataset of 2,896 CircleCI build jobs; these approaches achieve an F1-score of up to 0.64. We then conduct a qualitative analysis of the detected developer-applied accelerations to create a detailed catalog of 14 patterns spanning four categories of purposes, 16 patterns spanning five categories of mechanisms, and three categories of magnitudes. From this catalog, we infer actionable implications for both the consumers and the providers of CI platforms. Developers can leverage our identified patterns to audit their CI pipelines for inefficiencies, such as redundant invocations of costly external services and rebuilds triggered by minor corrections. Additionally, developers can use our identified patterns to create templates that detect non-impactful changes to specific file types, such as .yml and .json files.
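As a rough illustration of the last point, the following minimal Python sketch shows one way a check for non-impactful changes could look. It is not taken from the thesis: the function name `is_skippable`, the constant `NON_IMPACTFUL_EXTENSIONS`, and the rule that a change set is skippable only when every changed file has a listed extension are all assumptions made for this example.

```python
from pathlib import PurePosixPath

# Hypothetical list of extensions treated as non-impactful (an assumption
# for this sketch; a real project would choose its own list with care).
NON_IMPACTFUL_EXTENSIONS = {".yml", ".json"}


def is_skippable(changed_paths):
    """Return True when every changed path has a listed extension.

    An empty change set is treated as not skippable, so that the
    costly job still runs when nothing is known about the change.
    """
    paths = list(changed_paths)
    if not paths:
        return False
    return all(
        PurePosixPath(path).suffix.lower() in NON_IMPACTFUL_EXTENSIONS
        for path in paths
    )
```

A CI step could pass the list of files changed in a commit to such a function and skip a costly job when it returns True. Whether a change to a given .yml or .json file is truly non-impactful depends on the project, so the extension list would need to be reviewed per repository.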