CS 482 Computational Techniques in Biological Sequence Analysis


Objectives

This course introduces the algorithms and computer science ideas used to understand biological sequences. Whereas BIOL 365 introduced applications of computational techniques in genome analysis, this course focuses on the algorithmic ideas underlying those techniques.

Intended Audience

CS 482 is intended for fourth-year students in the Bioinformatics plan.

Related Courses

Prerequisites: BIOL 365; CM 339/CS 341; and either STAT 241 or at least 60% in STAT 231.

References

Biological Sequence Analysis, by R. Durbin, S. Eddy, A. Krogh and G. Mitchison, Cambridge University Press, 1999 (Required)

Schedule

3 hours of lectures per week. Normally available in Winter.

Outline

Introduction (3 hours)

Review of the fundamentals of molecular biology and genetics, in the context of biology as an information science.

Pairwise Sequence Alignment (4 hours)

Classic dynamic programming ideas for pairwise alignment. Statistical measures of alignment significance. Probabilistic models of homologous sequences.
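
As an illustration of the dynamic programming idea named above, the following is a minimal sketch of global alignment scoring in the style of Needleman-Wunsch. The function name and the scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen for the example and are not taken from the course material.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best global alignment score of a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # a[:i] aligned entirely against gaps
    for j in range(1, cols):
        score[0][j] = j * gap  # b[:j] aligned entirely against gaps
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # a[i-1] against a gap
                score[i][j - 1] + gap,      # b[j-1] against a gap
            )
    return score[-1][-1]
```

The table is filled in time and space proportional to the product of the two sequence lengths; recovering the alignment itself would additionally require a traceback, which this sketch omits.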

Heuristic Sequence Alignment (5 hours)

Mathematical ideas underlying BLAST, FASTA and other heuristic sequence aligners. Applications of sequence alignment. Multiple alignment.

Exact String Matching (5 hours)

Suffix trees and their application in pairwise and multiple alignment.

Sequence Annotation (8 hours)

Gene finding. Sequence feature detection. Motif finding. Hidden Markov models and their extensions. Contemporary algorithms for understanding sequence features.
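
To give a flavour of how hidden Markov models are used for annotation, the following is a minimal sketch of Viterbi decoding, which finds the most probable state path for an observed sequence. The function name, the parameter layout (nested dictionaries of probabilities) and the assumptions that the observation sequence is non-empty and that all probabilities are nonzero are illustrative choices, not part of the course material.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state path for obs under a simple HMM.

    Assumes obs is non-empty and every probability used is nonzero.
    """
    # best[t][s]: log-probability of the best path that ends in state s
    # after emitting obs[:t + 1].
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
             for s in states}]
    back = []  # back[t][s]: predecessor of s on that best path
    for symbol in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            prev = max(states,
                       key=lambda r: best[-1][r] + math.log(trans_p[r][s]))
            col[s] = (best[-1][prev] + math.log(trans_p[prev][s])
                      + math.log(emit_p[s][symbol]))
            ptr[s] = prev
        best.append(col)
        back.append(ptr)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```

Working in log space avoids numerical underflow on long sequences, which matters for genome-scale input.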

Evolutionary Tree Algorithms (5 hours)

Classical and contemporary algorithms for inferring evolutionary trees. Parsimony, neighbour joining, and statistical methods. Efficient heuristics. The relevance of phylogenetics to contemporary bioinformatics.

Current Topics (6 hours)