Please note: This seminar will be given online.
Wenhu Chen, Department of Computer Science, University of California, Santa Barbara
One of the ultimate goals of artificial intelligence is to build a knowledgeable virtual assistant that can understand natural language input and search the Web to provide accurate information to humans. Building such an assistant requires two steps: 1) understanding human input and reasoning over massive Web knowledge to derive the supporting knowledge, and 2) grounding on that supporting knowledge to generate natural language responses that communicate with humans.
In the first part of the talk, I will cover how to understand human language in order to reason over Web knowledge. Specifically, I will first discuss how to answer simple relation/entity-centric questions by deducing soft logical rules from structured Web knowledge (Freebase, YAGO). Then I will discuss how to answer more complex multi-hop questions by reasoning across heterogeneous structured and unstructured Web knowledge with an efficient retriever-reader framework. These models can accurately navigate the Web to derive supporting knowledge. In the second part, I will describe how to generate faithful natural language responses by grounding on this supporting (structured) knowledge. Specifically, I propose a novel knowledge-grounded pre-training algorithm that leverages unlabeled data from the Web to train a generation model. The pre-trained generation model can produce very precise natural language responses, greatly outperforming the existing GPT-2 in terms of faithfulness.
Finally, I will conclude my talk by proposing future directions for knowledge-grounded natural language processing.
Bio: Wenhu Chen is a fourth-year Ph.D. student at the University of California, Santa Barbara, advised by William Yang Wang and Xifeng Yan. His research interests cover natural language processing, deep learning, and knowledge representation. Specifically, he aims to develop models that can ground and reason over external world knowledge to understand human language and communicate with humans. He is also interested in multi-modal problems such as visual question answering and image/video captioning.
He has interned at several companies, including Google Research, Microsoft AI & Research, Samsung Research America, and eBay Research. He publishes at and serves on the program committees of ACL, NAACL, EMNLP, ICLR, NeurIPS, and CVPR. He was recognized as the top reviewer at NeurIPS 2019. He received the WACV Best Student Paper Honorable Mention in 2021.
To join this seminar on Zoom, please go to https://zoom.us/j/94365649914?pwd=K3VXaTFIMDUvZ1grZzhUTjFMTS94dz09.