CS 798: Kolmogorov complexity and
its applications
Fall 2007
Instructor: Ming Li
DC 3355, x4659
mli@uwaterloo.ca
Course time: Mondays 4:30-7:00pm, DC 3313
Office hours: Mondays 1:30-3:30pm, or just
drop by my office any time.
Textbook:  Ming Li and Paul Vitanyi, An
introduction to Kolmogorov
complexity and its applications. 2nd edition, Springer,
1997 
The course material
will be mainly from our textbook and papers that will be provided
at this website. The students will do a small research project at
the end of the term, and present it in class.
Note that this course is not a bioinformatics course. It is a
general course on Kolmogorov complexity and many of its applications.
Kolmogorov complexity is a modern theory of information and randomness.
The goal of this course is to teach the students the basic concepts of
Kolmogorov complexity and how to use this tool in their own research.
Topics include: plain Kolmogorov complexity, randomness,
prefix Kolmogorov complexity, incompressibility method,
information distance, applications in various fields ranging from
average-case analysis of algorithms to bioinformatics and from
document comparison to internet search. We will not be able to cover
the complete textbook, rather we will focus more on recent
developments and especially on applications of Kolmogorov
complexity.
Marking Scheme: Each student is expected to do three homeworks
and a final project, which can be either theoretical or programming,
and to present the project in class. Three homeworks: 20% each;
final project and presentation: 35%; class attendance
(including guest lectures and student presentations) and
participation in discussions: 5%.
Course announcements and lecture notes will appear on this page.
You are responsible for checking it regularly (and for downloading
the lecture notes before class).
Assignments:

Assignment 1 contains the following problems from the book: 2.1.1[a],
2.1.2, 2.1.3, 2.1.5, 2.1.13 (note: this one
requires "Symmetry of Information", page 182),
2.2.5, 2.2.6, 2.2.8, and, optionally, 2.3.4. Because of the delay, this
assignment will be due Oct 8 instead of Oct 1, in class.

Assignment 2. Do Exercises 2.4.2, 2.5.1, 2.5.2, 2.5.3, 2.7.5 from the
textbook, and prove the following statement: if a binary sequence x
of length n has exactly n/2 0's, then x is not c-incompressible, for large
enough n.
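The statement in Assignment 2 rests on a counting fact that is easy to check numerically: there are only C(n, n/2) balanced strings of length n, and by Stirling's approximation log2 of that count is about n - (1/2)log2 n, strictly below n. A small Python sketch (function name mine, for illustration only; it does not solve the exercise) shows the gap:

```python
import math

def log2_balanced_count(n: int) -> float:
    """log2 of the number of length-n binary strings with exactly n/2 0's."""
    return math.log2(math.comb(n, n // 2))

# By Stirling, C(n, n/2) ~ 2^n * sqrt(2/(pi*n)), so log2 C(n, n/2) is
# roughly n - (1/2)*log2(n): noticeably fewer than n bits for large n.
for n in (100, 1000, 10000):
    print(n, round(log2_balanced_count(n), 2))
```

Since each balanced string can be named by its index in this smaller set, it admits a description shorter than n bits by about (1/2)log2 n, which is where the incompressibility argument gets its traction.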

Assignment 3. Problem 1: Use Kolmogorov complexity
to prove that a one-head two-way DFA cannot accept
L = {w#w | w in {0,1}*}. Problem 2: Using Kolmogorov complexity, obtain
an average-case lower bound for binary search.
Problem 3: Using Kolmogorov complexity (directly, without using
theorems from Chapter 6), prove that {0^n 1^m | m > n} is not regular.
Problem 4. Do Exercise 6.8.5. Problem 5: Shaker sort
compares each adjacent pair of items in a list in turn, swapping
them if necessary, and alternately passes through the list from the
beginning to the end, then from the end to the beginning. It stops when
a pass does no swaps. Using an incompressibility argument, prove a tight
average-case lower bound for Shaker sort. Problem 6. Do Exercise
6.3.1, using Kolmogorov complexity. Problem 7. Do Exercise
6.3.2, using Kolmogorov complexity.
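For concreteness, the Shaker sort (cocktail sort) procedure described in Problem 5 can be sketched in Python as follows; the function name and the convention of sorting a copy are my own choices:

```python
def shaker_sort(items: list) -> list:
    """Sort a copy of `items` by alternating forward and backward
    adjacent-swap passes, stopping when a full pass does no swaps."""
    a = list(items)
    swapped = True
    while swapped:
        swapped = False
        # Forward pass: beginning to end, bubbling large items right.
        for i in range(len(a) - 1):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:
            break
        swapped = False
        # Backward pass: end to beginning, bubbling small items left.
        for i in range(len(a) - 2, -1, -1):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
    return a
```

Note that each swap removes exactly one inversion, which is the handle the incompressibility argument uses: a random (incompressible) permutation has many inversions, forcing many swaps.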
Possible project topics:

Energy-time trade-off for information processing.
This is a theoretical, research-oriented project, related to Lecture 4.

Alternative normalization of E(x,y) in Lecture 4.

Experimental project. Compare information distance against
traditional distances for internet search, as discussed in Lecture
5 notes.

Information distance approach applied to protein-protein relationships. This
is proposed by Brona Brejova. This is probably suitable for
a bioinformatics student.

History and comparative study of information distance metrics.
(Talk to me, I will provide some papers.)

Simplify and improve average-case analysis of
Quicksort using Kolmogorov complexity (I will provide the papers).

Improve the Shellsort average-case lower bound (open). This is very
difficult.

Incompressibility method: find a problem in your own research and carry
out an average-case or lower-bound analysis of it.

Please do not hesitate to discuss both finding and solving
your project problem with me. Please let me know what you wish to
work on. We generally prefer that students work
on different projects, unless a project has some research component.
Lectures:
Note: The lecture notes are constantly being improved, so it is a good idea
to download the most recent version before class.

Lecture 1. History, definition of Kolmogorov complexity, basics
(Chapters 1 and 2). 1 week.
Lecture 1 notes, ppt file

Lecture 2: Randomness. Definitions and Martin-Löf tests
(Sections 2.4, 2.5). 1 week.
Lecture 2 notes, ppt file

Lecture 3: Symmetry of Information (Section 2.8). 1 week.
Lecture 3 notes, ppt file

Lecture 4: Information Distance,
normalized information distance, and applications
(Sections 8.2, 8.3, 8.4, 8.6,
this paper, this paper, and this paper). 2-3 weeks.
Lecture 4 notes, ppt file
Lecture 4 QA notes
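Since the normalized information distance of Lecture 4 is uncomputable, applications approximate the Kolmogorov complexity terms with a real compressor, giving the normalized compression distance NCD(x,y) = (C(xy) - min(C(x),C(y))) / max(C(x),C(y)). A minimal sketch using Python's zlib as the compressor (any compressor would do; the variable names are mine):

```python
import zlib

def C(x: bytes) -> int:
    """Approximate the Kolmogorov complexity of x by its compressed length."""
    return len(zlib.compress(x, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: a computable stand-in for the
    normalized information distance. Near 0 for very similar inputs,
    near 1 (or slightly above) for unrelated inputs."""
    cx, cy, cxy = C(x), C(y), C(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"the quick brown fox jumps over the lazy dog " * 20
b = b"kolmogorov complexity measures information content " * 20
print(ncd(a, a))  # small: a string shares all its information with itself
print(ncd(a, b))  # large: little shared structure between the two strings
```

The quality of the approximation depends on how well the compressor finds the regularities shared by x and y; the papers linked above discuss this in detail.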

Lecture 5: The incompressibility method and its applications in
average-case analysis and lower bounds (Chapter 6). 2 weeks.
Lecture 5 notes, ppt file
Additional lecture 5 notes
by Tao Jiang, ppt file

Lecture 5.1: Lecture notes for June 28. The incompressibility method
continued. Lecture 5.1 notes, ppt file
Sorry about being late for this one. Please download before class if you
see this. 1 week.

Lecture 6: Prefix Kolmogorov complexity (Chapter 3)
Lecture 6 notes, ppt file
1 week.

Lecture 7: Inductive Inference (Chapter 5)
Lecture 7 notes, ppt file
1 week.

Lecture 8: Kolmogorov complexity and nature (Chapter 8)
Lecture 8 notes, ppt file
1/2 week.

Lecture 9: Resource bounded Kolmogorov complexity and computational
complexity (Chapter 7).
Lecture 9 notes, ppt file
For those who are interested in reading more, see
these Lance Fortnow notes. 1 week.

Student presentations: 2 weeks (1/2 hour per student).
Announcements:
Maintained by Ming Li