Please note: This master’s thesis presentation will take place online.
Nasif Ahmed, Master’s candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Mei Nagappan
Context: The pull-based development model is a widely adopted practice in distributed version control systems, particularly in open-source projects. In this model, contributors submit pull requests proposing changes to the codebase, which are then reviewed and potentially merged by project maintainers. Previous studies have extensively investigated how different factors influence merge outcomes, aiming to generalize their impact across multiple projects.
Objective: This thesis takes a different approach by examining these factors at the level of individual projects, aiming to understand how the influence of each factor varies from one project to another.
Methodology: To achieve this, we conducted a large-scale quantitative analysis of 841,399 pull requests from 1,100 GitHub projects. We constructed a fixed-effect logistic regression model for each project and explored the correlations between different factors and merge outcomes.
Results: Our analysis indicates that the influence of factors varies across projects, both in their relative order and in the direction of their effect. For example, while contributor experience is highly valued in many projects, it was found not to be statistically significant in others. Likewise, the likelihood of a successful merge increases with the number of commits in some projects, whereas in others a higher number of commits makes a merge less likely. These findings have implications for both researchers and practitioners.
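Illustration: The idea of fitting one model per project and comparing the sign of a factor's coefficient can be sketched as follows. This is a minimal, simplified sketch and not the models used in the thesis: it fits a plain single-predictor logistic regression (number of commits only) by Newton-Raphson, and the two projects, `alpha` and `beta`, and all of their pull request records are hypothetical toy data invented for the example.

```python
import math
from collections import defaultdict


def fit_logistic(xs, ys, iterations=25):
    """Fit P(y=1) = sigmoid(b0 + b1 * x) by Newton-Raphson; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # Fisher information (negative Hessian)
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical toy data: (project, number_of_commits, merged).
# The outcomes overlap across commit counts, so the fit is well defined.
alpha = [(1, 0), (1, 0), (2, 0), (2, 1), (3, 0), (3, 1),
         (4, 1), (4, 0), (5, 1), (5, 1), (6, 1), (6, 1)]
pull_requests = [("alpha", c, m) for c, m in alpha]
# Project "beta" mirrors "alpha": the same commit counts, opposite outcomes.
pull_requests += [("beta", c, 1 - m) for c, m in alpha]

# Group the pull requests by project, then fit one model per project.
by_project = defaultdict(lambda: ([], []))
for project, commits, merged in pull_requests:
    by_project[project][0].append(commits)
    by_project[project][1].append(merged)

slopes = {}
for project, (xs, ys) in by_project.items():
    _, slope = fit_logistic(xs, ys)
    slopes[project] = slope
    direction = "more" if slope > 0 else "less"
    print(f"{project}: commits coefficient {slope:+.3f} "
          f"(more commits -> {direction} likely to merge)")
```

In this toy setting the commits coefficient is positive for one project and negative for the other, which is the kind of difference in direction that a single model pooled over all projects would hide.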