Freda Shi
  石昊悦

Greetings! I am an Assistant Professor in the David R. Cheriton School of Computer Science at the University of Waterloo and a Faculty Member at the Vector Institute, where I also hold a Canada CIFAR AI Chair. I received my Ph.D. in Computer Science from the Toyota Technological Institute at Chicago in 2024, where I was advised by Professors Karen Livescu and Kevin Gimpel, and was supported by a Google Ph.D. Fellowship. I completed my Bachelor's degree in Intelligence Science and Technology (Computer Science Track) in 2018 at Peking University, with a minor in Sociology.

Research

My research interests are in computational linguistics and natural language processing. I work toward a deeper understanding of natural language and the human language processing mechanism, and toward using these insights to design more efficient, effective, safe, and trustworthy NLP systems. I am particularly interested in learning language through grounding, computational multilingualism, and related machine learning questions. For more details, check out my publications and the CompLING Lab at the University of Waterloo.

Prospective students and visitors: please read this.

Publications


Siren's song in the AI ocean: A survey on hallucination in large language models
Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi

Computational Linguistics 2025    

How Tokenization Limits Phonological Knowledge Representation in Language Models and How to Improve Them
Disen Liao, Freda Shi

Tokenization Workshop, ICML 2025    

FORG3D: Flexible Object Rendering for Generating Vision-Language Spatial Reasoning Data from 3D Scenes
Oscar Pang, Freda Shi

ACL Demo 2025     Code / Project Page

Knowledge Distillation for Language Models
Yuqiao Wen, Freda Shi, Lili Mou

NAACL-HLT Tutorial 2025     Paper

Learning Language through Grounding
Freda Shi, Ziqiao Ma, Jiayuan Mao, Parisa Kordjamshidi, Joyce Chai

NAACL-HLT Tutorial 2025     Paper

SpaRE: Enhancing Spatial Reasoning in Vision-Language Models with Synthetic Data
Michael Ogezi, Freda Shi

ACL 2025     arXiv
Oral Presentation

Blessing of Multilinguality: A Systematic Analysis of Multilingual In-Context Learning
Yilei Tu, Andrew Xue, Freda Shi

Findings of ACL 2025     arXiv

Logical forms complement probability in understanding language model (and human) performance
Yixuan Wang, Freda Shi

ACL 2025     arXiv
Oral Presentation; Abridged version presented at the 2025 Workshop on Cognitive Modeling and Computational Linguistics (CMCL)

Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities
Zheyuan Zhang, Fengyuan Hu, Jayjun Lee, Freda Shi, Parisa Kordjamshidi, Joyce Chai, Ziqiao Ma

ICLR 2025     Paper / Code / arXiv / Project Page / Data
Oral Presentation; Abridged version presented at the 2025 NeurIPS Workshop on Pluralistic Alignment

Gated slot attention for efficient linear-time sequence modeling
Yu Zhang, Songlin Yang, Ruijie Zhu, Yue Zhang, Leyang Cui, Yiqiao Wang, Bolun Wang, Freda Shi, Bailin Wang, Wei Bi, Peng Zhou, Guohong Fu

NeurIPS 2024     Paper / arXiv

Learning Language Structures through Grounding
Haoyue Freda Shi

PhD Thesis, Toyota Technological Institute at Chicago 2024     Paper
Thesis of Distinction
AAAI 2025 New Faculty Highlight

Structured Tree Alignment for Evaluation of (Speech) Constituency Parsing
Freda Shi, Kevin Gimpel, Karen Livescu

ACL 2024     Paper / arXiv

LogogramNLP: Comparing Visual and Textual Representations of Ancient Logographic Writing Systems for NLP
Danlu Chen, Freda Shi, Aditi Agarwal, Jacobo Myerston, Taylor Berg-Kirkpatrick

ACL 2024     Paper
Best Paper Nominee

Large language models can be easily distracted by irrelevant context
Freda Shi, Xinyun Chen, Kanishka Misra, Nathan Scales, David Dohan, Ed H. Chi, Nathanael Schärli, Denny Zhou

ICML 2023     Paper / arXiv

Language models are multilingual chain-of-thought reasoners
Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, et al.

ICLR 2023     Paper / arXiv

Audio-Visual Neural Syntax Acquisition
Cheng-I Jeff Lai, Freda Shi, Puyuan Peng, Yoon Kim, Kevin Gimpel, Shiyu Chang, Yung-Sung Chuang, Saurabhchand Bhati, David Cox, David Harwath, Yang Zhang, Karen Livescu, James Glass

ASRU 2023     Paper / arXiv

InCoder: A Generative Model for Code Infilling and Synthesis
Daniel Fried, Armen Aghajanyan, Jessy Lin, Sida Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Wen-tau Yih, Luke Zettlemoyer, Mike Lewis

ICLR 2023     Paper / arXiv
Spotlight Presentation

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Srivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, Samuel Cahyawijaya, Emile Chapuis, Wanxiang Che, Mukund Choudhary, Christian Clauss, Pierre Colombo, Filip Cornell, Gautier Dagan, Mayukh Das, Tanay Dixit, Thomas Dopierre, Paul-Alexis Dray, Suchitra Dubey, Tatiana Ekeinhor, Marco Di Giovanni, Rishabh Gupta, Rishabh Gupta, Louanes Hamla, Sang Han, Fabrice Harel-Canada, Antoine Honore, Ishan Jindal, Przemyslaw K. Joniak, Denis Kleyko, Venelin Kovatchev, Kalpesh Krishna, Ashutosh Kumar, Stefan Langer, Seungjae Ryan Lee, Corey James Levinson, Hualou Liang, Kaizhao Liang, Zhexiong Liu, Andrey Lukyanenko, Vukosi Marivate, Gerard de Melo, Simon Meoni, Maxime Meyer, Afnan Mir, Nafise Sadat Moosavi, Niklas Muennighoff, Timothy Sum Hon Mun, Kenton Murray, Marcin Namysl, Maria Obedkova, Priti Oli, Nivranshu Pasricha, Jan Pfister, Richard Plant, Vinay Prabhu, Vasile Pais, Libo Qin, Shahab Raji, Pawan Kumar Rajpoot, Vikas Raunak, Roy Rinberg, Nicolas Roberts, Juan Diego Rodriguez, Claude Roux, Vasconcellos P. H. S., Ananya B. Sai, Robin M. Schmidt, Thomas Scialom, Tshephisho Sefara, Saqib N. Shamsi, Xudong Shen, Haoyue Shi, Yiwen Shi, Anna Shvets, Nick Siegel, Damien Sileo, Jamie Simon, Chandan Singh, Roman Sitelew, Priyank Soni, Taylor Sorensen, William Soto, Aman Srivastava, KV Aditya Srivatsa, Tony Sun, Mukund Varma T, A Tabassum, Fiona Anting Tan, Ryan Teehan, Mo Tiwari, Marie Tolkiehn, Athena Wang, Zijian Wang, Gloria Wang, Zijie J. Wang, Fuxuan Wei, Bryan Wilie, Genta Indra Winata, Xinyi Wu, Witold Wydmański, Tianbao Xie, Usama Yaseen, M. Yee, Jing Zhang, Yue Zhang

NEJLT 2023     Paper

Substructure Distribution Projection for Zero-Shot Cross-Lingual Dependency Parsing
Freda Shi, Kevin Gimpel, Karen Livescu

ACL 2022     Paper / arXiv

Natural Language to Code Translation with Execution
Freda Shi, Daniel Fried, Marjan Ghazvininejad, Luke Zettlemoyer, Sida I. Wang

EMNLP 2022     Paper / arXiv

Deep Clustering of Text Representations for Supervision-Free Probing of Syntax
Vikram Gupta, Haoyue Shi, Kevin Gimpel, Mrinmaya Sachan

AAAI 2022     Paper / arXiv

Substructure Substitution: Structured Data Augmentation for NLP
Haoyue Shi, Karen Livescu, Kevin Gimpel

Findings of ACL 2021     Paper / arXiv

Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment
Haoyue Shi, Luke Zettlemoyer, Sida I. Wang

ACL 2021     Paper / arXiv
Oral Presentation
Best Paper Nominee

Grammar-Based Grounded Lexicon Learning
Jiayuan Mao, Freda Shi, Jiajun Wu, Roger P. Levy, Joshua B. Tenenbaum

NeurIPS 2021     Paper / arXiv

On the Role of Supervision in Unsupervised Constituency Parsing
Haoyue Shi, Karen Livescu, Kevin Gimpel

EMNLP 2020     Paper / arXiv
Oral Presentation

A Cross-Task Analysis of Text Span Representations
Shubham Toshniwal, Haoyue Shi, Bowen Shi, Lingyu Gao, Karen Livescu, Kevin Gimpel

RepL4NLP 2020     Paper / arXiv

Visually Grounded Neural Syntax Acquisition
Haoyue Shi, Jiayuan Mao, Kevin Gimpel, Karen Livescu

ACL 2019     Paper / arXiv
Oral Presentation
Best Paper Nominee

Learning Visually-Grounded Semantics from Contrastive Adversarial Samples
Haoyue Shi, Jiayuan Mao, Tete Xiao, Yuning Jiang, Jian Sun

COLING 2018     Paper / arXiv

On Tree-Based Neural Sentence Modeling
Haoyue Shi, Hao Zhou, Jiaze Chen, Lei Li

EMNLP 2018     Paper / arXiv

On Multi-Sense Word Embeddings via Matrix Factorization and Matrix Multiplication
Haoyue Shi

Bachelor's Thesis, Peking University 2018     Paper
Best Dissertation Award

Constructing High Quality Sense-specific Corpus and Word Embedding via Unsupervised Elimination of Pseudo Multi-sense
Haoyue Shi, Xihao Wang, Yuqi Sun, Junfeng Hu

LREC 2018     Paper

Implicit Subjective and Sentimental Usages in Multi-sense Word Embeddings
Yuqi Sun, Haoyue Shi, Junfeng Hu

WASSA 2018     Paper

Joint Saliency Estimation and Matching using Image Regions for Geo-Localization of Online Video
Haoyue Shi, Jia Chen, Alexander G. Hauptmann

ICMR 2017     Paper

Real Multi-Sense or Pseudo Multi-Sense: An Approach to Improve Word Representation
Haoyue Shi, Caihua Li, Junfeng Hu

CL4LC 2016     Paper

Cardiovascular Risk Prediction Method Based on Test Analysis and Data Mining Ensemble System
Shan Xu, Haoyue Shi, Xiaohui Duan, Tiangang Zhu, Peihua Wu, Dongyue Liu

ICBDA 2016     Paper