Please note: This seminar will be given online.
Aukosh Jagannath, Department of Statistics and Actuarial Science, University of Waterloo
I will report on recent work with G. Ben Arous (NYU) and R. Gheissari (Berkeley) on the performance of online stochastic gradient descent (SGD) in high-dimensional inference tasks. We develop a classification of loss landscapes by the difficulty of such problems, namely whether, for a given loss function and typical realizations of the data, SGD requires linearly, quasilinearly, or polynomially many samples in the dimension to perform the inference task. This classification depends on an intrinsic property of the population loss, which we call the “information exponent,” as opposed to almost sure properties of the loss landscape (e.g., quasi-convexity or saddle-type properties). We find that from uniform-at-random starts, the majority of the data is used in the initial “search” phase (where the landscape is highly non-convex), as compared to the final “descent” phase (where the algorithm is in a trust region).
In this talk, I will illustrate our methods on a simple class of problems, namely supervised learning with a single-layer network in the case of Gaussian patterns. Here we obtain a classification of the sample complexity as one varies the activation function. If time remains, I will illustrate how this approach can be extended to analyze gradient descent in the case of Tensor PCA.
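As a point of orientation for the setup above, here is a minimal sketch (not from the talk) of one-pass online SGD for a single-index model with Gaussian inputs; the dimension, step-size scaling, choice of tanh activation, and squared loss are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: online (one-pass) SGD for y = activation(<theta*, x>)
# with fresh Gaussian samples at each step. All parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

d = 500                      # dimension
n = 20 * d                   # number of samples (one fresh sample per step)
activation = np.tanh         # example activation; the talk varies this choice

theta_star = rng.standard_normal(d)
theta_star /= np.linalg.norm(theta_star)        # ground-truth direction

theta = rng.standard_normal(d)
theta /= np.linalg.norm(theta)                  # uniform-at-random start

eta = 0.5 / d                                   # step size (assumed scaling)

for t in range(n):
    x = rng.standard_normal(d)                  # fresh Gaussian pattern
    y = activation(x @ theta_star)              # noiseless label for simplicity
    pred = activation(x @ theta)
    # gradient of the squared loss 0.5 * (pred - y)**2 w.r.t. theta
    grad = (pred - y) * (1.0 - pred**2) * x     # tanh'(u) = 1 - tanh(u)^2
    theta -= eta * grad

overlap = abs(theta @ theta_star) / np.linalg.norm(theta)
print(f"final overlap with the signal direction: {overlap:.3f}")
```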
To join this seminar on Zoom, please go to https://zoom.us/j/92937099244.