Master’s Thesis Presentation • Software Engineering — Evaluating the Effectiveness of Code2Vec for Bug Prediction When Considering That Not All Bugs Are the Same

Friday, September 4, 2020 — 2:00 PM EDT

Please note: This master’s thesis presentation will be given online.

Kilby Baron, Master’s candidate
David R. Cheriton School of Computer Science

Supervisors: Professors Mike Godfrey and Mei Nagappan

Bug prediction is an area of research focused on predicting where in a software project future bugs will occur. The purpose of bug prediction models is to help companies spend their quality assurance resources more efficiently by prioritizing the testing of the most defect-prone entities. However, almost every bug prediction model makes the same oversight, one that dramatically reduces its practical utility: it treats all bugs as the same. Such models are concerned only with predicting whether an entity has a bug, or how many bugs it will have, which implies that all bugs are equally important. In reality, bugs can have vastly different origins, impacts, priorities, and costs; therefore, for a bug prediction model to be useful in practice, it should indicate which bugs to prioritize based on an organization’s needs.

This paper evaluates a possible method for predicting bug attributes related to cost by analyzing over 33,000 bugs from 11 different projects. If bug attributes related to cost can be predicted, then bug prediction models can use the approach to improve the granularity of their results. The cost metrics in this study are bug priority, the experience of the developer who fixed the bug, and the size of the bug fix. First, it is shown that bugs differ along each cost metric, and that prioritizing buggy entities along each of these metrics produces very different results. We then evaluate two methods of predicting cost metrics: traditional deep learning models, and semantic learning models. The analysis finds evidence that traditional independent variables show potential as predictors of cost metrics. The semantic learning model was not as successful, but may prove more effective in future iterations.
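
As a rough illustration of why the choice of cost metric matters, the minimal sketch below ranks three made-up bugs by each of the three metrics named in the abstract. Everything in it is an assumption for illustration only: the `Bug` class, its field names, the numeric scales, and the sample values are hypothetical and are not taken from the thesis or its data set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bug:
    """Hypothetical record; field names and scales are illustrative only."""
    bug_id: str
    priority: int          # assumed scale: 1 = most urgent
    fixer_experience: int  # assumed proxy: prior commits by the fixing developer
    fix_size: int          # assumed proxy: lines changed by the fix


# Made-up sample values, not data from the study.
BUGS = [
    Bug("A", priority=1, fixer_experience=12, fix_size=4),
    Bug("B", priority=3, fixer_experience=850, fix_size=40),
    Bug("C", priority=2, fixer_experience=90, fix_size=310),
]


def rank(bugs, key, reverse=False):
    """Return bug ids ordered by the given cost metric."""
    return [b.bug_id for b in sorted(bugs, key=key, reverse=reverse)]


# Most urgent first (lowest priority number first).
by_priority = rank(BUGS, key=lambda b: b.priority)
# Bugs that needed the most experienced developer first.
by_experience = rank(BUGS, key=lambda b: b.fixer_experience, reverse=True)
# Largest fixes first.
by_fix_size = rank(BUGS, key=lambda b: b.fix_size, reverse=True)

print(by_priority, by_experience, by_fix_size)
```

With these sample values, each metric puts a different bug at the top of the list, which is the kind of disagreement between rankings that the abstract describes.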


To join this master’s thesis presentation virtually on Zoom, please go to https://zoom.us/j/94264857048?pwd=djJYdi9VNUtTZ3Q1U0ZmaGRERnAvUT09.

Location
Online presentation
200 University Avenue West
Waterloo, ON N2L 3G1
Canada
