Please note: This seminar will take place online.
Ananya Kumar, PhD candidate
Department of Computer Science, Stanford University
Machine learning systems are not robust: they suffer large drops in accuracy when deployed in environments that differ from those they were trained on. In this talk, I show that the foundation model paradigm (adapting models that are pretrained on broad unlabeled data) is a principled solution that leads to state-of-the-art robustness. I will focus on the key ingredients: how we should adapt and pretrain models for robustness. (1) First, I will show that the standard approach to adaptation (updating all of the model's parameters) can distort pretrained representations and perform poorly out-of-distribution. Our theoretical analysis leads to better adaptation methods and state-of-the-art accuracies on ImageNet and in applications such as satellite remote sensing, wildlife conservation, and radiology. (2) Next, we will examine how to pretrain good representations from unlabeled data. I show that contrastive pretraining on unlabeled data (before adapting to labeled data in one domain) improves accuracy even on domains for which we have no labels. We explain why pretraining works in a way that differs from classical domain adaptation intuitions: it connects different domains rather than collapsing their representations together. Our theory predicts phenomena on real datasets and leads to improved pretraining methods.
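To make the first point concrete, one adaptation strategy in this spirit is a two-stage procedure: fit only a linear head on frozen pretrained features, then fine-tune all parameters starting from that head. The sketch below is a minimal PyTorch-style illustration of that idea, assuming a generic `backbone` that maps inputs to `feat_dim`-dimensional features and a labeled `loader`; the function name and hyperparameters are illustrative, not the exact recipe from the talk.

```python
import torch
import torch.nn as nn

def two_stage_adapt(backbone, feat_dim, num_classes, loader,
                    lp_epochs=5, ft_epochs=5, lp_lr=1e-3, ft_lr=1e-5):
    """Linear-probe first, then fine-tune everything (a sketch, not the talk's exact method)."""
    head = nn.Linear(feat_dim, num_classes)
    model = nn.Sequential(backbone, head)
    loss_fn = nn.CrossEntropyLoss()

    # Stage 1: freeze the pretrained backbone and fit only the linear head,
    # so the pretrained features are not distorted while the head is random.
    for p in backbone.parameters():
        p.requires_grad = False
    opt = torch.optim.Adam(head.parameters(), lr=lp_lr)
    for _ in range(lp_epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

    # Stage 2: unfreeze and fine-tune all parameters from the probed head,
    # typically with a much smaller learning rate.
    for p in backbone.parameters():
        p.requires_grad = True
    opt = torch.optim.Adam(model.parameters(), lr=ft_lr)
    for _ in range(ft_epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model
```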
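For the second point, contrastive pretraining typically trains an encoder so that two augmented views of the same example have similar embeddings while other examples in the batch serve as negatives. Below is a minimal InfoNCE-style loss as a rough illustration of that kind of objective; the specific loss, augmentations, and pretraining setup studied in the talk may differ.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    """InfoNCE-style contrastive loss for paired views (z1[i], z2[i]) of the same example.

    z1, z2: (batch, dim) embeddings from two augmentations of the same inputs.
    Matching rows across views are positives; all other rows act as negatives.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature          # (batch, batch) cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    # Symmetrize: each view tries to pick out its counterpart in the other view.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))
```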
Bio: Ananya Kumar is a Ph.D. candidate in the Department of Computer Science at Stanford University, advised by Percy Liang and Tengyu Ma. His work focuses on representation learning, foundation models, and reliable machine learning. His papers have been recognized with several Spotlight and Oral presentations at NeurIPS, ICML, and ICLR, and his research is supported by a Stanford Graduate Fellowship.