CS 482 Computational Techniques in Biological Sequence Analysis
Objectives
This course introduces the algorithms and computer science concepts used to understand biological sequences. Whereas BIOL 365 introduced applications of computational techniques in genome analysis, this course focuses on the algorithmic ideas underlying those techniques.
Intended Audience
CS 482 is intended for fourth-year students in the Bioinformatics plan.
Related Courses
Prerequisites: BIOL 365; CM 339/CS 341; and STAT 241 or at least 60% in STAT 231.
References
Biological Sequence Analysis, by R. Durbin, S. Eddy, A. Krogh and G. Mitchison, Cambridge University Press, 1999 (Required)
Schedule
3 hours of lectures per week. Normally available in Winter.
Outline
Introduction (3 hours)
Review of the fundamentals of molecular biology and genetics, in the context of biology as an information science.
Pairwise Sequence Alignment (4 hours)
Classic dynamic programming ideas for pairwise alignment. Statistical measures of alignment significance. Probabilistic models of homologous sequences.
Heuristic Sequence Alignment (5 hours)
Mathematical ideas underlying BLAST, FASTA and other heuristic sequence aligners. Applications of sequence alignment. Multiple alignment.
Exact String Matching (5 hours)
Suffix trees and their application in pairwise and multiple alignment.
Sequence Annotation (8 hours)
Gene finding. Sequence feature detection. Motif finding. Hidden Markov models and their extensions. Contemporary algorithms for understanding sequence features.
Evolutionary Tree Algorithms (5 hours)
Classical and contemporary algorithms for inferring evolutionary trees. Parsimony, neighbour joining, and statistical methods. Efficient heuristics. The relevance of phylogenetics to contemporary bioinformatics.
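Illustrative example
To give a flavour of the classic dynamic programming ideas listed under Pairwise Sequence Alignment, the following is a minimal sketch of global alignment in the Needleman-Wunsch style. It is not taken from the course materials: the function name needleman_wunsch and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen only for illustration, and real aligners typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b with a linear gap penalty.

    Scoring values are illustrative assumptions, not course-specified.
    Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] aligned against gaps only

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table row by row
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, x, y = needleman_wunsch("GATTACA", "GCATGCU")
    print(s)
    print(x)
    print(y)
```

The table takes O(nm) time and space for sequences of lengths n and m. That quadratic cost is one motivation for the heuristic aligners covered later in the outline.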