Please note: This seminar will be given online.
Eugene Ndiaye, Tennenbaum President’s Postdoctoral Fellow
H. Milton Stewart School of Industrial and Systems Engineering
Georgia Institute of Technology
If you predict a label y of a new object with ŷ, how confident are you that y = ŷ? The conformal prediction method provides an elegant framework for answering such a question: it builds a confidence set for the unobserved response of a feature vector based on previous observations of similar responses and features, without any assumptions about the distribution of the data. While it provides strong coverage guarantees, computing a conformal prediction set requires fitting a model to an augmented dataset for every possible value the unobserved response could take, and then selecting the most likely candidates. For a regression problem where y is a continuous variable, this typically requires an infinite number of model fits, which is usually infeasible. In general, the computation of such distribution-free confidence sets is still considered an open problem. I advocate a simplification and propose to settle for an accurate approximation of these confidence sets at a prescribed precision. To do so, I will introduce (hopefully) weak regularity assumptions on the prediction model that make the calculations feasible, and demonstrate how one can effectively and efficiently quantify prediction uncertainty for several machine learning algorithms.
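To illustrate the computational issue the abstract refers to, here is a minimal sketch (not the speaker's method) of the standard "full" conformal procedure for regression, where the infinite set of candidate responses is crudely replaced by a finite grid; the grid size, miscoverage level alpha, and the ridge model are illustrative choices only.

    # Illustrative sketch of grid-based full conformal prediction.
    # Assumes numpy and scikit-learn; all parameter choices are hypothetical.
    import numpy as np
    from sklearn.linear_model import Ridge

    def conformal_set(X, y, x_new, alpha=0.1, grid=None):
        """Return candidate responses z whose conformal p-value exceeds alpha."""
        if grid is None:
            # Coarse grid over (roughly) the observed response range.
            grid = np.linspace(y.min() - 1.0, y.max() + 1.0, 200)
        accepted = []
        for z in grid:
            # Augment the data with the hypothesized response z and refit the model.
            X_aug = np.vstack([X, x_new.reshape(1, -1)])
            y_aug = np.append(y, z)
            model = Ridge(alpha=1.0).fit(X_aug, y_aug)
            residuals = np.abs(y_aug - model.predict(X_aug))
            # Conformal p-value: proportion of residuals at least as large as
            # the candidate's own residual (the last entry).
            p_value = np.mean(residuals >= residuals[-1])
            if p_value > alpha:
                accepted.append(z)
        return np.array(accepted)

Each candidate value z triggers a full refit, so a continuous response would demand infinitely many fits; the talk is about replacing this brute-force enumeration with an accurate approximation under mild regularity assumptions on the model.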
Bio: Eugene Ndiaye is a Tennenbaum President’s Postdoctoral Fellow in the H. Milton Stewart School of Industrial and Systems Engineering at Georgia Institute of Technology, hosted by Xiaoming Huo and Pascal Van Hentenryck. Previously, he was a Postdoctoral Researcher at the RIKEN Center for Advanced Intelligence Project (AIP) in the data-driven biomedical science team led by Ichiro Takeuchi. He did a PhD in Applied Mathematics under the supervision of Olivier Fercoq and Joseph Salmon at Télécom ParisTech (now part of the Institut Polytechnique de Paris). His PhD thesis focused on the design and analysis of faster and safer optimization algorithms for variable selection and hyperparameter calibration. His current and forthcoming work focuses on the efficient construction of confidence regions with minimal assumptions on the distribution of the data as well as the analysis of selection biases in optimization algorithms for machine learning.