Please note: This PhD seminar will take place online.
Zeping Mao, PhD candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Ming Li
Autoregressive models have achieved strong performance in sequence prediction tasks, but they often struggle when correct test-time solutions require flexible search beyond patterns explicitly observed during training. This issue becomes particularly important in scientific discovery problems, where the target space may contain meaningful but previously unseen structures.
In this talk, I will present our recent work, RNovA, through this broader machine learning lens. We study de novo peptide sequencing from tandem mass spectra, with a focus on open post-translational modification (PTM) discovery, where the model must generalize to modifications not specified during training. To address this challenge, we introduce a reinforcement-learning-style training strategy that encourages more flexible search while maintaining the stability and accuracy of autoregressive sequence prediction. Conceptually, this framework can be viewed as a way to refine standard autoregressive generation with finer-grained search guidance, analogous in spirit to how diffusion-style refinement improves generation through iterative correction. Empirically, our method not only enables zero-shot open PTM discovery, but also preserves state-of-the-art performance on conventional de novo peptide sequencing over the standard 20 amino acids. More broadly, this work suggests that reinforcement-learning-style refinement may offer a general strategy for extending autoregressive models to settings where the exact inference-time targets are only partially represented in the training data.
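For readers unfamiliar with this style of training, the sketch below illustrates the general idea in PyTorch: an autoregressive decoder samples candidate sequences, and a scalar reward reweights the log-probabilities of the sampled tokens (a REINFORCE-style update). This is a minimal illustration under assumed components, not the RNovA training procedure; the decoder, vocabulary, and reward function are hypothetical stand-ins.

```python
# Minimal REINFORCE-style refinement of an autoregressive decoder (illustrative only).
# Not the RNovA method: the model, token vocabulary, and reward are hypothetical.
import torch
import torch.nn as nn

VOCAB = 25            # e.g., 20 amino acids plus a few placeholder "modified" tokens
MAX_LEN = 12
EMBED, HIDDEN = 32, 64

class TinyDecoder(nn.Module):
    """Tiny autoregressive GRU decoder over residue tokens."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMBED)
        self.rnn = nn.GRU(EMBED, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, VOCAB)

    def forward(self, tokens, h=None):
        out, h = self.rnn(self.embed(tokens), h)
        return self.head(out), h

def toy_reward(seq):
    """Stand-in scalar reward; a real system would score the peptide against the spectrum."""
    return float((seq % 5 == 0).float().mean())

model = TinyDecoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):
    # Sample a sequence token by token, keeping the log-probability of each choice.
    tok = torch.zeros(1, 1, dtype=torch.long)   # fixed start token
    h, log_probs, sampled = None, [], []
    for _ in range(MAX_LEN):
        logits, h = model(tok, h)
        dist = torch.distributions.Categorical(logits=logits[:, -1])
        tok = dist.sample().unsqueeze(1)
        log_probs.append(dist.log_prob(tok.squeeze(1)))
        sampled.append(tok.item())

    # REINFORCE update: scale the summed log-probabilities by the sequence reward,
    # so rewarded sequences (including unseen token patterns) become more likely.
    reward = toy_reward(torch.tensor(sampled))
    loss = -reward * torch.stack(log_probs).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```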