Please note: This master’s thesis presentation will take place online.
Xingye Fan, Master’s candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Yuri Boykov
Segment size, equivalently referred to as volume, area, or cardinality, is uniquely determined by a segmentation, e.g., by aggregating pixel-level predictions. On the other hand, size provides only a very weak constraint on segmentation. However, as this thesis observes, explicit size constraints are powerful cues for training segmentation models in weakly supervised settings, e.g., when only image-level class tags are provided or only a fraction of pixels is labeled. We also observe that some standard unsupervised losses may have an implicit size bias, resulting in notable segmentation artifacts.
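For intuition, the aggregation mentioned above can be made concrete: a segmentation network's soft pixel-wise predictions determine an expected segment size by summing class probabilities over all pixels. The following is a minimal PyTorch-style sketch (the function name and tensor layout are illustrative assumptions, not the thesis's exact implementation):

```python
import torch
import torch.nn.functional as F

def predicted_sizes(logits: torch.Tensor) -> torch.Tensor:
    """Aggregate soft pixel-level predictions into per-class segment sizes.

    logits: (B, K, H, W) raw network outputs for K classes.
    Returns: (B, K) expected segment sizes, measured in pixels.
    """
    probs = F.softmax(logits, dim=1)   # soft segmentation; sums to 1 at each pixel
    return probs.sum(dim=(2, 3))       # expected cardinality of each segment
```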
This thesis addresses three closely related problems concerning size constraints in segmentation and size prediction. First, we propose explicit size targets for training segmentation models without ground-truth masks. We show that approximate size targets provided by human annotators yield segmentation quality on par with full pixel-precise supervision. The second contribution of this thesis is to expose an implicit size bias in standard unsupervised segmentation losses common in scribble supervision, e.g., mutual information or the Potts model. We show that this bias leads to performance collapse as the amount of scribbles decreases; in contrast, our size-target supervision works well without any scribbles. Lastly, motivated by the improved segmentation achieved with size-target supervision, we explore the potential of deep models to predict sizes directly.
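To illustrate the flavor of size-target supervision, one could penalize the deviation of the aggregated soft size from an approximate annotator-provided target. The sketch below, reusing the hypothetical predicted_sizes helper from above, is our own simplified illustration with a quadratic penalty, not the thesis's exact loss:

```python
def size_target_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Penalize deviation of predicted segment sizes from approximate targets.

    logits:  (B, K, H, W) network outputs.
    targets: (B, K) approximate size targets, given as fractions of image area.
    """
    b, k, h, w = logits.shape
    sizes = predicted_sizes(logits) / (h * w)   # normalize sizes to fractions
    return ((sizes - targets) ** 2).mean()      # simple quadratic size penalty
```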