CS 782 Course Description | SCS | UW


The Human Genome Project has revealed new possibilities for both the detection of genetic diseases and the design of highly effective pharmaceuticals. Attaining these goals will require computational algorithms that help elucidate the intricate biological interactions within the cell. With such algorithms as the primary motivation, this course will study the pattern discovery techniques currently used to extract the functional knowledge hidden in biomolecular data derived from DNA, RNA, proteins, and their reaction products.

The course will be heavily oriented toward current research papers in the field, and students will be expected to present critical assessments of existing research. Student projects will focus on developing programs that apply various pattern discovery algorithms to online databases.


Pattern Discovery in Biomolecular Data, J.T.L. Wang, B.A. Shapiro, D. Shasha (Eds.), Oxford University Press, 1999.


3 hours of lectures per week.


Introduction (3 hrs)

Review of the central dogma: DNA, RNA, protein, cell systems.

DNA Sequences (8 hrs)

Sequence analysis, alignment techniques, the algorithmic significance method.

RNA Sequences (1 hr)

Features of RNA secondary structure.

RNA Folding and 3D Structure (3 hrs)

Structure prediction techniques, stochastic context-free grammars, genetic algorithms.

Protein Sequences and Motif Discovery (4 hrs)

Issues in motif discovery, tools, sequence comparisons for motif discovery.

Protein Secondary Structure (3 hrs)

Features of protein secondary structure, prediction techniques.

Protein 3D Structure (3 hrs)

Issues in protein folding, threading, structural comparisons for motif discovery.

Cellular Networks (3 hrs)

Analysis of expression data, inference techniques for gene regulatory networks.

Student Presentations (8 hrs)