Please note: This seminar will take place in DC 2585.
Felix Dangel, Postdoctoral Researcher, Vector Institute for Artificial Intelligence
Popular deep learning frameworks prioritize computing the average mini-batch gradient. Yet other quantities, such as its variance and various approximations to the Hessian, can be computed efficiently alongside the gradient mean. These quantities are of great interest to researchers and practitioners, but implementing them is often burdensome or inefficient.
To address this, I present an ecosystem on top of PyTorch that provides efficient and automated access to these quantities. I will illustrate its usefulness for troubleshooting neural network training and for enabling novel approaches to optimization in deep learning.
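
The abstract does not name the packages in this ecosystem; the sketch below assumes it includes BackPACK (github.com/f-dangel/backpack), Dangel's library for extracting additional quantities during backpropagation. It illustrates how the per-parameter gradient variance and a Monte-Carlo approximation of the generalized Gauss-Newton diagonal (a Hessian proxy) can be requested in a single backward pass, alongside the usual gradient.

    # Illustrative sketch, assuming BackPACK; the abstract does not
    # name specific packages, so treat this as an example only.
    import torch
    from backpack import backpack, extend
    from backpack.extensions import Variance, DiagGGNMC

    # Wrap model and loss so BackPACK can hook into backpropagation.
    model = extend(torch.nn.Sequential(
        torch.nn.Linear(10, 5), torch.nn.ReLU(), torch.nn.Linear(5, 2)
    ))
    lossfunc = extend(torch.nn.CrossEntropyLoss())

    X, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
    loss = lossfunc(model(X), y)

    # Request extra quantities; they are computed during the same
    # backward pass that produces the mini-batch gradient mean.
    with backpack(Variance(), DiagGGNMC()):
        loss.backward()

    for param in model.parameters():
        print(param.grad.shape)         # standard mini-batch gradient
        print(param.variance.shape)     # per-parameter gradient variance
        print(param.diag_ggn_mc.shape)  # MC diagonal generalized Gauss-Newton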
Bio: Felix Dangel is a postdoctoral researcher at the Vector Institute. He is interested in the efficient computation of quantities beyond the gradient and their application to deep learning.
His PhD with Philipp Hennig at the University of Tübingen and the Max Planck Institute for Intelligent Systems focused on extending gradient backpropagation to efficiently extract higher-order geometric and statistical information about the loss landscape of neural networks, with the goal of improving their training and inspiring novel algorithmic research.