Revised July 3, 2015
CS 482: Computational Techniques in Biological Sequence Analysis
General description
This course introduces the best-known problems in bioinformatics and the algorithms behind their solutions. These problems include sequence alignment, large-scale sequence database search, evolutionary tree reconstruction, gene prediction, and protein sequencing. Students explore the underlying computational techniques and develop the skills needed to solve similar problems.
Logistics
Audience
- Students taking the Bioinformatics option or students interested in learning how to apply mathematical modeling and algorithmic methods to solve biological problems. Usually taken in fourth year.
Normally available
- Winter
Related courses
- Prerequisites: CS 341, and either STAT 241 or at least 60% in STAT 231
For official details, see the UW calendar.
Software/hardware used
- A personal computer for programming
Typical reference(s)
- R. Durbin, S. Eddy, A. Krogh and G. Mitchison, Biological Sequence Analysis, Cambridge University Press, 1999
Required preparation
At the start of the course, students should be able to
- Program in Java, C++, or Python
- Design algorithms and analyze an algorithm's complexity
- Describe basic concepts in molecular biology, or be prepared to learn them quickly in the first few weeks
Learning objectives
At the end of the course, students should be able to
- Find and use common bioinformatics resources and tools
- Apply the learned modeling and algorithmic techniques to solve computational problems in biology
- Apply the learned modeling and algorithmic techniques to solve data analysis problems in other areas
Typical syllabus
Introduction
- Brief review of the fundamentals of molecular biology and genetics in the context of biology as an information science.
Pairwise sequence alignment
- Classic dynamic programming ideas for pairwise sequence alignment
- Statistical measures of alignment significance
- Probabilistic models of homologous sequences
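As an illustration of the dynamic programming idea behind pairwise alignment, the following is a minimal sketch of global alignment in the style of Needleman-Wunsch. The function name and the scoring values (match = 1, mismatch = -1, linear gap penalty = -2) are assumptions chosen for the example, not values prescribed by the course.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty (illustrative scoring)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

With these assumed scores, `global_align("ACGT", "AGT")` aligns `ACGT` against `A-GT` with score 1. The table takes O(nm) time and space.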
Heuristic sequence alignment
- Mathematical ideas underlying BLAST, FASTA and other heuristic sequence aligners
- Applications of sequence alignment
- Multiple alignment
Exact string matching
- Suffix trees, suffix arrays, and their application in pairwise and multiple alignment
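To make the suffix array idea concrete, here is a small sketch that builds a suffix array by sorting suffixes directly and then finds all occurrences of a pattern by binary search. The function names are hypothetical, and the naive construction is chosen for clarity only; it is far slower than the specialized construction algorithms used in practice.

```python
def build_suffix_array(text):
    """Start positions of all suffixes of text, in lexicographic order.

    Naive construction: sorts the suffixes directly (illustration only).
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return sorted start positions of pattern in text, using suffix array sa."""
    k = len(pattern)

    # Lower bound: first suffix whose length-k prefix is >= pattern
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + k] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-k prefix is > pattern
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + k] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in sa[start:lo] begins with pattern
    return sorted(sa[start:lo])
```

For example, the suffix array of `"banana"` is `[5, 3, 1, 0, 4, 2]`, and searching it for `"ana"` returns positions `[1, 3]`. Each search uses O(log n) string comparisons of length at most the pattern length.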
Sequence annotation
- Gene finding
- Sequence feature detection
- Motif finding
- Hidden Markov models
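As a sketch of how hidden Markov models are used for sequence annotation, the following computes the most probable state path with the Viterbi algorithm in log space. The function name, the dictionary-based model representation, and the two-state AT-rich/GC-rich toy model in the usage note are assumptions made for illustration. The sketch also assumes a non-empty observation sequence and strictly positive probabilities.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable state path for obs under an HMM, computed in log space.

    Assumes obs is non-empty and every probability used is > 0.
    """
    # v[t][s] = log-probability of the best path ending in state s at time t
    v = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]  # back[t][s] = best predecessor of state s at time t

    for t in range(1, len(obs)):
        v.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda r: v[t - 1][r] + math.log(trans_p[r][s]))
            v[t][s] = (
                v[t - 1][prev]
                + math.log(trans_p[prev][s])
                + math.log(emit_p[s][obs[t]])
            )
            back[t][s] = prev

    # Follow back-pointers from the best final state
    last = max(states, key=lambda s: v[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, v[-1][last]


# Hypothetical toy model: regions that are AT-rich versus GC-rich
STATES = ["AT", "GC"]
START = {"AT": 0.5, "GC": 0.5}
TRANS = {"AT": {"AT": 0.9, "GC": 0.1}, "GC": {"AT": 0.1, "GC": 0.9}}
EMIT = {
    "AT": {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1},
    "GC": {"A": 0.1, "T": 0.1, "C": 0.4, "G": 0.4},
}
```

Under this toy model, `viterbi("AAAAGGGG", STATES, START, TRANS, EMIT)` labels the first four positions `AT` and the last four `GC`. Working in log space avoids numerical underflow on long sequences.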
Evolutionary tree algorithms
- Classical and contemporary algorithms for inferring evolutionary trees
Protein sequence identification
- Mass spectrometry and its application in protein identification and sequencing