PhD Seminar • Bioinformatics • Anti-Cancer Peptides Identification and Activity Type Classification with Protein Sequence Pre-training

Monday, February 12, 2024 1:00 pm - 2:00 pm EST (GMT -05:00)

Please note: This PhD seminar will take place online.

Shaokai Wang, PhD candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Bin Ma

Cancer remains a significant global health challenge, responsible for millions of deaths annually. Addressing this issue necessitates the discovery of novel anti-cancer drugs. Anti-cancer peptides (ACPs), with their unique ability to selectively target cancer cells, offer new hope for discovering anti-cancer drugs with few side effects. However, the process of discovering novel ACPs is both time-consuming and costly. Therefore, there is an urgent need for a computational method that can predict whether a given peptide is an ACP and classify its functional types. In this work, we introduce DUO-ACP, a model serving dual roles in ACP prediction: identification and functional type classification.

DUO-ACP employs two embedding modules, one to acquire knowledge about global protein features and one to capture local ACP characteristics, complemented by a prediction module. When assessed on two publicly available datasets for each task, DUO-ACP surpasses all existing methods, achieving an ACP identification accuracy of 89.5% and a macro-averaged AUC of 88.6% in ACP functional type classification.

We further interpret the contribution of each part of our model, including the two types of embeddings as well as ensemble learning. On a newly curated dataset, the prediction results of DUO-ACP closely match findings reported in the existing literature, highlighting DUO-ACP’s ability to generalize to previously unseen data and demonstrating its potential to discover novel ACPs.