Beyond Gradient Descent for Regularized Segmentation Losses

Dmitrii Marin, Meng Tang, Ismail Ben Ayed, Yuri Boykov

In IEEE conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, California, 2019.


The simplicity of gradient descent (GD) has made it the default method for training ever deeper and more complex neural networks. Both loss functions and architectures are often explicitly tuned to be amenable to this basic local optimization. In the context of weakly-supervised CNN segmentation, we demonstrate a well-motivated loss function where an alternative optimizer (ADM) achieves state-of-the-art results while GD performs poorly. Interestingly, GD obtains its best result for a "smoother" tuning of the loss function. The results are consistent across different network architectures. Our loss is motivated by well-understood MRF/CRF regularization models in "shallow" segmentation and their known global solvers. Our work suggests that network design/training should pay more attention to optimization methods.
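To make the contrast concrete, the sketch below illustrates the general idea of splitting-based alternation versus plain GD on a toy 1D problem. This is not the paper's method: the network/data term is replaced by a simple quadratic fit, the CRF regularizer by a smooth pairwise penalty, and the "shallow" subproblem solver by a closed-form linear solve. All names (`gd_baseline`, `adm_split`, `rho`, `lam`) are illustrative assumptions.

```python
# Toy sketch (NOT the paper's algorithm): minimize f(x) + g(x) where
#   f(x) = 0.5 * ||x - y||^2          (stand-in for the network/data term)
#   g(x) = lam * sum_i (x_i - x_{i+1})^2   (stand-in for a pairwise regularizer)
# once by plain gradient descent on the sum, and once by an ADMM-style
# split that alternates gradient steps on x (playing the role of network
# training) with an exact solve for the regularized variable z (playing
# the role of the "shallow" global solver).
import numpy as np

def grad_f(x, y):
    return x - y

def grad_g(x, lam):
    # gradient of lam * sum_i (x_i - x_{i+1})^2
    g = np.zeros_like(x)
    d = x[:-1] - x[1:]
    g[:-1] += 2 * lam * d
    g[1:] -= 2 * lam * d
    return g

def gd_baseline(y, lam, steps=500, lr=0.1):
    # plain gradient descent directly on the joint loss f(x) + g(x)
    x = np.zeros_like(y)
    for _ in range(steps):
        x -= lr * (grad_f(x, y) + grad_g(x, lam))
    return x

def adm_split(y, lam, rho=1.0, outer=50, inner=20, lr=0.1):
    # scaled-dual ADMM on: min_{x,z} f(x) + g(z)  s.t. x = z
    x = np.zeros_like(y)
    z = np.zeros_like(y)
    u = np.zeros_like(y)
    n = len(y)
    # 1D path-graph Laplacian L, so that grad g(z) = 2 * lam * L @ z
    L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    L[0, 0] = L[-1, -1] = 1
    A = 2 * lam * L + rho * np.eye(n)
    for _ in range(outer):
        # x-step: a few gradient steps on f(x) + (rho/2)||x - z + u||^2
        for _ in range(inner):
            x -= lr * (grad_f(x, y) + rho * (x - z + u))
        # z-step: closed-form solve of (2*lam*L + rho*I) z = rho*(x + u)
        z = np.linalg.solve(A, rho * (x + u))
        # dual update
        u += x - z
    return x
```

Since both terms here are convex quadratics, both optimizers reach (essentially) the same minimum on this toy problem; the paper's point is that for the actual non-convex CNN losses the two can behave very differently.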

WHOLE PAPER: PDF file (3.2 MB)
RELATED PAPER: On Regularized Losses for Weakly-supervised CNN segmentation (ECCV 2018)
