Please note: This PhD seminar will take place in DC 3317.
Zhongwen Zhang, PhD candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Yuri Boykov
We propose “collision cross-entropy” as a robust alternative to Shannon’s cross-entropy in the context of self-labeled classification with posterior models. Assuming unlabeled data, self-labeling works by estimating latent pseudo-labels, categorical distributions y, that optimize some discriminative clustering criteria, e.g., “decisiveness” and “fairness”. All existing self-labeled losses incorporate a Shannon cross-entropy term targeting the model prediction, softmax σ, at the estimated distribution y; in effect, σ is trained to mimic the uncertainty in y exactly. Instead, we propose the negative log-likelihood of a “collision” to maximize the probability of equality between two random variables represented by distributions σ and y.
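As a rough illustration (a minimal sketch, not the speaker’s implementation), if σ and y are treated as independent categorical variables over the same K classes, the “collision” probability is their dot product, and the proposed loss is its negative log:

import numpy as np

def collision_cross_entropy(sigma, y, eps=1e-12):
    # P(S = Y) = sum_k sigma_k * y_k for independent S ~ sigma and Y ~ y
    collision_prob = np.sum(sigma * y, axis=-1)
    # negative log-likelihood of the "collision" event
    return -np.log(collision_prob + eps)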
We show that our loss satisfies some properties of a generalized cross-entropy. Interestingly, it agrees with Shannon’s cross-entropy for one-hot pseudo-labels y, but training from softer labels is weaker. For example, if y is a uniform distribution at some data point, that point contributes nothing to the training. Our self-labeling loss, combining collision cross-entropy with basic clustering criteria, is convex w.r.t. the pseudo-labels, but non-trivial to optimize over the probability simplex. We derive a practical EM algorithm that optimizes the pseudo-labels y significantly faster than generic methods, e.g., projected gradient descent. The collision cross-entropy consistently improves the results on multiple self-labeled clustering examples using different DNNs.
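The one-hot and uniform-label properties stated above are easy to check numerically; the values below are illustrative and assume the dot-product form of the collision probability sketched earlier:

import numpy as np

sigma = np.array([0.7, 0.2, 0.1])   # example softmax prediction (illustrative only)

# One-hot pseudo-label: the collision loss reduces to Shannon's cross-entropy -log(sigma_k)
y_onehot = np.array([1.0, 0.0, 0.0])
print(-np.log(sigma @ y_onehot))    # == -log(0.7)

# Uniform pseudo-label: the collision probability is 1/K regardless of sigma,
# so the loss is constant and this point contributes no gradient to training
y_uniform = np.ones(3) / 3
print(-np.log(sigma @ y_uniform))   # == -log(1/3), independent of sigma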