Please note: This seminar will take place in DC 1304 and virtually over Zoom.
Yuntian Deng, PhD candidate
Harvard John A. Paulson School of Engineering and Applied Sciences
The field of text generation has seen significant progress in recent years. We are approaching a future where ubiquitous text generation technologies will allow us to generate long-form texts that are not only fluent at a surface level, but also coherent in their overall structure. To achieve this future, I work on evaluating and improving structure modeling in language models.
In the first part of this talk, I will introduce a method for quantifying structural coherence in language models. This method utilizes a critic to extract structures by projecting data to a latent space of interest, then compares the structures of model generations to real data. This quantitative measure of structural coherence allows us to identify structural issues in language models.
In the second part of the talk, I will present my research on improving structure modeling in language models, based on insights gained from evaluating structural coherence. Specifically, I will demonstrate how utilizing a critic to guide the generative process can improve the coherence of the generated text. In closing, I will discuss potential future directions of long-form text generation, including its applications in areas such as creative writing, multi-modal generation, and genome modeling.
Bio: Yuntian Deng is a PhD student at Harvard University, advised by Prof. Alexander Rush and Prof. Stuart Shieber. His research focuses on developing long-form text generation methods that are coherent, transparent, and efficient. He is also a key contributor to several open-source projects, including OpenNMT, image-to-LaTeX, and LaTeX-to-image.
Yuntian is the recipient of an Nvidia Fellowship, a Baidu Fellowship, and multiple awards for his research, including the University of Chicago Rising Stars in Data Science, the ACL 2017 Best Demo Paper Runner-Up, the ACM Gordon Bell Special Prize for Covid Research, and the DAC 2020 Best Paper.
To attend this seminar in person, please go to DC 1304. You can also attend virtually using Zoom at https://uwaterloo.zoom.us/j/92750782689.
For those attending virtually: The passcode will be provided by email on the Friday before the seminar as well as on the morning of the seminar.