Master’s Thesis Presentation • Machine Learning — Wasserstein Autoencoders with Mixture of Gaussian Priors for Stylized Text Generation

Thursday, January 21, 2021, 10:00 am EST (GMT -05:00)

Please note: This master’s thesis presentation will be given online.

Amirpasha Ghabussi, Master’s candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Olga Vechtomova

Probabilistic text generation is an important application of Natural Language Processing (NLP). Variational autoencoders (VAEs) and Wasserstein autoencoders (WAEs) are two widely used methods for text generation, and recent research has focused on improving the quality of the samples they generate. While Wasserstein autoencoders are effective for text generation, they cannot control the topic of the generated text, even when the training dataset contains samples from multiple categories with different styles.

We present a semi-supervised approach using Wasserstein autoencoders and a mixture of Gaussian priors for topic-aware sentence generation. Our model is trained on a multi-class dataset and generates sentences in the style/topic of a desired class. It is also capable of interpolating between multiple classes.
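To make the idea concrete, here is a minimal NumPy sketch of the prior structure the abstract describes: one Gaussian component per class in the latent space, so that sampling from a class's component yields a latent code in that class's region, and blending two component means interpolates between classes. The dimensions, means, variance, and function names here are illustrative assumptions, not the thesis's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 16   # illustrative latent size, not from the thesis
num_classes = 3   # illustrative number of classes

# One Gaussian component per class: mean mu_k with a shared isotropic
# standard deviation sigma. In the real model the means would be learned.
means = rng.normal(scale=2.0, size=(num_classes, latent_dim))
sigma = 0.5

def sample_latent(class_idx: int) -> np.ndarray:
    """Draw a latent code z ~ N(mu_k, sigma^2 I) for the chosen class."""
    return means[class_idx] + sigma * rng.normal(size=latent_dim)

def interpolate_latent(class_a: int, class_b: int, alpha: float) -> np.ndarray:
    """Sample around a blend (1 - alpha) * mu_a + alpha * mu_b of two class means,
    giving a latent code "between" the two classes."""
    mixed_mean = (1.0 - alpha) * means[class_a] + alpha * means[class_b]
    return mixed_mean + sigma * rng.normal(size=latent_dim)

z_class0 = sample_latent(0)              # latent code in class 0's region
z_blend = interpolate_latent(0, 1, 0.5)  # halfway between classes 0 and 1
```

In a full WAE, a decoder network would map such latent codes `z` to sentences; choosing which component to sample from is what gives control over the topic/style of the output.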

Moreover, our model can be trained on relatively small datasets. While a regular WAE or VAE cannot generate diverse sentences from few training samples, our approach generates diverse sentences while preserving the style and content of the desired classes.


To join this master’s thesis presentation on MS Teams, please go to https://teams.microsoft.com/l/meetup-join/19%3ameeting_OWFkOWFmYTUtYWVjNS00M2Q3LThhODctZjY0ZTgxMjI2MWFh%40thread.v2/0?context=%7b%22Tid%22%3a%22723a5a87-f39a-4a22-9247-3fc240c01396%22%2c%22Oid%22%3a%223555791e-2942-4f03-a606-9a4b25bb8f4c%22%7d.