Please note: This master’s thesis presentation will take place in DC 3317.
Mohammad Mahdi Abdollahpour, Master’s candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Patrick Lam
In today’s software development landscape, the use of third-party libraries is near-ubiquitous; leveraging them can significantly accelerate development, allowing teams to implement complex functionality without reinventing the wheel. However, one significant cost of reusing code is exposure to security vulnerabilities. Vulnerabilities in third-party libraries have allowed attackers to breach databases, conduct identity theft, steal sensitive user data, and launch mass phishing campaigns. Notorious examples of vulnerabilities in libraries from the past few years include Log4Shell, SolarWinds, event-stream, lodash, and Equifax.
Existing software composition analysis (SCA) tools track the propagation of vulnerabilities from libraries through dependencies to downstream clients and alert those clients. Due to their design, many existing tools are highly imprecise: they create alerts for clients even when the flagged vulnerabilities are not exploitable.
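To make the notion of precision concrete, the following minimal sketch flags every client that transitively depends on a vulnerable library, then compares those alerts against an assumed ground truth of which clients are actually exploitable. The dependency graph, the package names, and the `exploitable` set are hypothetical illustrations, not data or tooling from the thesis.

```python
from collections import deque

# Hypothetical dependency graph: package -> packages it depends on directly.
DEPENDS_ON = {
    "client_a": ["lib_x"],
    "client_b": ["lib_y"],
    "client_c": ["lib_z"],
    "lib_y": ["lib_x"],
    "lib_x": [],
    "lib_z": [],
}


def transitive_deps(package, graph):
    """Return every package reachable from `package` via dependency edges."""
    seen = set()
    queue = deque(graph.get(package, []))
    while queue:
        dep = queue.popleft()
        if dep not in seen:
            seen.add(dep)
            queue.extend(graph.get(dep, []))
    return seen


def dependency_level_alerts(clients, vulnerable_lib, graph):
    """Flag each client whose transitive dependencies include the vulnerable library."""
    return {c for c in clients if vulnerable_lib in transitive_deps(c, graph)}


def precision(alerts, exploitable):
    """Fraction of alerted clients that are truly exploitable."""
    return len(alerts & exploitable) / len(alerts) if alerts else 1.0


clients = ["client_a", "client_b", "client_c"]
alerts = dependency_level_alerts(clients, "lib_x", DEPENDS_ON)

# Assumed ground truth: only client_a can actually reach the vulnerable code.
exploitable = {"client_a"}

print(sorted(alerts), precision(alerts, exploitable))
```

In this toy setup, `client_b` is alerted only because it pulls in `lib_x` indirectly, which is the kind of false positive that lowers precision.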
Library developers occasionally release new versions of their software with refactorings that improve modularity. In this work, we explore the impacts of modularity improvements on vulnerability detection. In addition to generally improving the nonfunctional properties of the code, refactoring also has several security-related beneficial side effects: (1) it improves the precision of existing (fast and stable) SCA tools; and (2) it protects against vulnerabilities that are exploitable whenever the vulnerable code is present, even if it is not reachable, as in gadget chain attacks.
Our primary contribution is thus to quantify, using a novel simulation-based counterfactual vulnerability analysis, two main ways that improved modularity can boost security. We propose a modularization method using a DAG partitioning algorithm, and statically measure properties of systems that we (synthetically) modularize. In our experiments, we find that modularization can improve the precision of SCA tools from 35% to 71%. Furthermore, migrating to modularized libraries results in 78% of clients no longer being vulnerable to attacks referencing inactive dependencies. We further verify that the results of our modularization reflect the structures that are already implicit in the projects (but for which no modularity boundaries are enforced).
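The counterfactual idea can be illustrated with a toy example: compare the alerts raised when a library ships as one unit against the alerts raised when the same library is split into modules and each client depends only on the modules it uses. This is a minimal sketch under strong assumptions; the function names, the module assignment, and the client usage sets are hypothetical, the partition is given by hand rather than computed by a DAG partitioning algorithm, and modules are assumed not to depend on one another.

```python
# Hypothetical assignment of library functions to modules.
MODULE_OF = {
    "parse": "core",
    "format": "core",
    "deserialize": "serialization",
    "lookup": "net",
}

# Hypothetical vulnerable function inside the library.
VULNERABLE_FUNCTION = "deserialize"

# Hypothetical clients and the library functions each one calls.
CLIENT_USES = {
    "client_a": {"parse", "deserialize"},
    "client_b": {"parse"},
    "client_c": {"format", "lookup"},
}


def monolithic_alerts(client_uses):
    """With a single monolithic library, every dependent client is flagged."""
    return set(client_uses)


def modular_alerts(client_uses, module_of, vulnerable_function):
    """Flag only clients that depend on the module containing the vulnerability."""
    vulnerable_module = module_of[vulnerable_function]
    return {
        client
        for client, used in client_uses.items()
        if vulnerable_module in {module_of[f] for f in used}
    }


def precision(alerts, truly_affected):
    """Fraction of alerted clients that actually use the vulnerable function."""
    return len(alerts & truly_affected) / len(alerts) if alerts else 1.0


# Assumed ground truth: a client is affected only if it calls the vulnerable function.
truly_affected = {c for c, used in CLIENT_USES.items() if VULNERABLE_FUNCTION in used}

before = monolithic_alerts(CLIENT_USES)
after = modular_alerts(CLIENT_USES, MODULE_OF, VULNERABLE_FUNCTION)

print("monolithic:", sorted(before), precision(before, truly_affected))
print("modular:   ", sorted(after), precision(after, truly_affected))
```

In this toy setup, `client_b` and `client_c` no longer include the `serialization` module at all after the split, so they would also not carry the vulnerable code as an inactive dependency.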