Meet Victor Zhong, a computer scientist who seeks to make machine learning more general and efficient through natural language understanding

Friday, November 15, 2024

Victor Zhong joined the Cheriton School of Computer Science as a tenure-track Assistant Professor in August 2024. He also serves as a CIFAR AI Chair and faculty member at the Vector Institute.

Professor Zhong has a PhD in Computer Science, specializing in natural language processing, from the University of Washington, a Master of Science degree in Computer Science from Stanford University, and a Bachelor of Applied Science in Computer Engineering from the University of Toronto.

His research is at the intersection of machine learning and natural language processing, with an emphasis on using language understanding to learn more generally and efficiently. His research covers a range of topics, including dialogue, code generation, question answering, and grounded reinforcement learning.

Professor Zhong has been awarded an Apple AI/ML Fellowship as well as an EMNLP Outstanding Paper award. His work has been featured in Wired, MIT Technology Review, TechCrunch, VentureBeat, Fast Company, and Quanta Magazine. He was a founding member of Salesforce Research and has previously worked at Microsoft Research, Meta AI Research and Google Brain.

As of November 2024, his research papers have been cited collectively more than 7,300 times with an h-index of 25 according to Google Scholar.

What follows is a lightly edited transcript of a conversation with Professor Zhong, where he discusses his research, his advice for aspiring computer scientists, and what excites him about joining the Cheriton School of Computer Science.

Computer Science Professor Victor Zhong in DC atrium

What challenges in machine learning and NLP do you find most exciting to tackle?

Machine learning has made incredible, real-world progress in the last few decades. ML models are recognizable in many services and products we use daily, from search engines to chat agents to personalized media recommendation systems. ML is also used widely in many services and products we might not associate with AI such as software schedulers, network routers, insurance pricing, and medical care prioritization.

What’s particularly fascinating about the prevalence of ML is that it has achieved such ubiquity while making a drastic and occasionally unfounded assumption: that the training and test examples are independent and identically distributed, that is, drawn independently from the same underlying distribution.

A perhaps oversimplified summary of standard ML goes like this: I assume that the test data and training data come from the same underlying distribution. So, when I train, I collect a large number of examples, which I combine into a dataset that approximates the underlying distribution. I then fit a model on this dataset, say, using maximum likelihood estimation. I then deploy this model into production and use it to make predictions on test data.

But this is clearly not how the real world works.

For instance, if I train a self-driving car in San Francisco, I cannot deploy the driving program directly in London, England because the rules of the road are different. Similarly, if I design an automatic controller to make espresso for a Breville machine, I cannot install it onto my Gaggia and expect it to work because the interfaces and physics of the two machines differ. If I train a natural language interface for a SQL database I created at the university, I cannot use it directly to answer questions at the Salesforce database used by a bakery down the street because the semantics of the databases and the database engines are different.
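The gap between the i.i.d. assumption and deployment can be sketched numerically. The following is a minimal illustration, not taken from Professor Zhong's work: it assumes a one-dimensional Gaussian model fit by maximum likelihood, and the function names, sample sizes, and the size of the shift are all hypothetical choices made for the example. It fits the model on training data and then compares the average log-likelihood on test data from the same distribution with test data whose mean has shifted.

```python
import math
import random


def fit_gaussian_mle(samples):
    """Return the maximum likelihood estimates (mean, std) of a Gaussian."""
    n = len(samples)
    mean = sum(samples) / n
    # The MLE of the variance divides by n, not n - 1.
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, math.sqrt(variance)


def avg_log_likelihood(samples, mean, std):
    """Average Gaussian log-likelihood of the samples under (mean, std)."""
    const = -0.5 * math.log(2 * math.pi * std ** 2)
    total = sum(const - (x - mean) ** 2 / (2 * std ** 2) for x in samples)
    return total / len(samples)


def run_experiment(seed=0, n=5000, shift=3.0):
    """Fit on training data, then score i.i.d. and shifted test data.

    All parameters are illustrative assumptions: `shift` is how far the
    deployment distribution's mean moves away from the training mean.
    """
    rng = random.Random(seed)
    train = [rng.gauss(0.0, 1.0) for _ in range(n)]
    test_iid = [rng.gauss(0.0, 1.0) for _ in range(n)]        # same distribution
    test_shifted = [rng.gauss(shift, 1.0) for _ in range(n)]  # shifted mean
    mean, std = fit_gaussian_mle(train)
    return {
        "mean": mean,
        "std": std,
        "iid": avg_log_likelihood(test_iid, mean, std),
        "shifted": avg_log_likelihood(test_shifted, mean, std),
    }


result = run_experiment()
print(f"fitted mean={result['mean']:.3f}, std={result['std']:.3f}")
print(f"avg log-likelihood, i.i.d. test:  {result['iid']:.3f}")
print(f"avg log-likelihood, shifted test: {result['shifted']:.3f}")
```

Under these assumptions the fitted model scores the i.i.d. test set about as well as its own training data, while the shifted test set receives a much lower average log-likelihood, even though nothing about the model has changed.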

I am excited about moving machine learning beyond the realm of independent, identically distributed train/test sets. In other words, how can we design machine learning systems that adapt efficiently in real-time after deployment?

Tell us a bit about your research.

My group, the Reading to Learn Lab at the University of Waterloo, explores how we can improve ML efficiency and enhance generalization using language understanding. Most ML models train on vast amounts of labelled data or experience for a specific task. When the problems change — for example, driving in a new country, operating a new coffee machine, using a language interface for a new database — the solution we trained no longer generalizes to new problems. In contrast, the strength of humans lies in our ability to adapt to new problems adeptly through reading. Understanding the traffic rules of a new country, the workings of a new coffee machine, or processing the content of a new database can be accomplished by reading the manual.

The thesis of our research is this: by reading language specifications that characterize key aspects of the problem, we can efficiently learn solutions that generalize to new problems. Our current work spans several areas.

First, we investigate novel methods to learn from human and machine language feedback, leveraging world knowledge provided by humans and foundation models. Second, we focus on developing semantic evaluations of foundation models, proposing automatic evaluation methods to comprehensively assess model capabilities without requiring domain expertise. Third, we develop post-training adaptation techniques that enable ML models to adapt effectively and privately to new distributions and contexts during test time. Looking further ahead, we are developing language agents in operating systems and techniques for learning from structured and unstructured multi-modal data.

What do you see as your most significant contribution?

My work is among the first to examine out-of-distribution generalization via language understanding. In 2017, we developed one of the first natural language interfaces that generalizes to new, unseen data tables by interpreting table schema. This line of work has since been extended to new databases, professional tool usage, and new problems faced by operating system agents.

These works are very much driven by real-world applications.

On the other hand, a more conceptual piece of research that I am very fond of is our 2020 work, which conducted one of the first careful studies of generalization to new environment dynamics by interpreting manuals. I am especially fond of this work because of how challenging it was and how uncertain the endeavour was.

We had no testbeds, no methods, no reference examples. Everything was designed, implemented, and tested from scratch. It was a wonderful time for me in London, both professionally and personally, working with what was at the time a newly minted Facebook AI Research group made up of several researchers whose work I had long admired. Looking back, this work became the foundation of my subsequent PhD work and my current research.

What advice would you give to students interested in pursuing research in your area?

I want to preface this by noting that I work in empirical research, where experimental evidence is king. An ideal student, and to some extent an ideal scientist, is curious, enjoys scientific rigour, and most importantly is an optimist. There are many different perspectives on how one should live their life. They are probably all legitimate. If one wants to pursue research, they should consider why they want to do so.

First, academic measures of success such as citations, credentials and awards can be arbitrary. I don’t believe that in isolation they are good reasons for pursuing research. This is why a student needs to be curious. They are doing research because they want to know and discover. Second, experiment design is difficult. A hastily designed experiment can yield results irrelevant to the hypothesis, and in the worst case yield wrong conclusions that mislead. This is why I think the student needs to enjoy scientific rigour: they are happy to put in the work because they enjoy the process. Finally, science is unpredictable. We primarily ask questions for which we do not know the answers. Working with this degree of uncertainty for years can be mentally exhausting.

This is why I think the student needs to be an optimist at heart. They should believe that the science is important, that it can succeed, and that once it does succeed it will make the world a better place. If the student agrees with this general sentiment, they should seek to get involved as early as possible. They will likely face many rejections. This is normal. But it will be OK because they are an optimist.

Do you see opportunities for collaborative research at the Cheriton School of Computer Science?

We are a large, vibrant department — one of the best in the world — with many world-class colleagues with expertise complementary to my own. My research covers a broad range of applications. They involve natural language semantic parsing to query structured and unstructured data, core machine learning topics in post-training adaptation and learning from feedback, as well as theoretical questions in specifying information in natural language. I am exploring collaborative opportunities with the Natural Language Processing Group, the Data Systems Group, and broadly in Artificial Intelligence.

What aspect of joining the Cheriton School of Computer Science excites you the most?

From a career perspective, I am excited to join the department for the expertise of our colleagues, the strength of our students, and the vibrant and supportive nature of our department.

From a personal perspective, I am excited to be back in Canada. I had a wonderful time growing up in Vancouver and later going to school in Toronto. I am grateful for this opportunity, and I look forward to contributing to the best of my abilities to make the department, region, and country better.

Who has inspired you most in your career?

I have had exceptional mentors over the years. Richard Socher and my master’s advisor, Chris Manning, were foundational to my career, for taking a chance on me when I had no credentials or experience. My PhD advisor, Luke Zettlemoyer, remains a personal inspiration, for his curiosity, rigour, and optimism not just in science but also in life. I especially thank Chris and Luke for showing me that, first, one can be an incredibly accomplished scientist and yet be humble and open-minded to new ideas. And, second, one can be an incredibly productive professional and yet be an amazingly supportive husband, father, and friend.

On a lighter note, my research perspective is largely summarized by Kevin Knight's quip in response to a somewhat pretentious question. At this particular event, several famous, high-profile, and accomplished scientists were interviewed and asked what AI researchers should work on and what they should avoid. Instead of giving a prescriptive monologue, Kevin replied, “I think people should work on whatever they want.”

I believe that science is ultimately a macro process. As long as the work is done with care and with purpose, we’ll be in good shape.

What do you do in your spare time?

I love to travel. I love local walking tours, exploring restaurants and breweries, and visiting city history museums. I also enjoy photography, especially when combined with travelling. Finally, I enjoy playing and listening to music. I was a jazz trombonist for around 10 years. The trombone is not a very neighbour-friendly instrument, so these days I stick to guitar.