CS 860: Kolmogorov complexity and
its applications
Winter 2010
Home 

Home
 Fun readings
Instructor: 
Ming
Li 
DC 3355, x84659 
mli@uwaterloo.ca 
Course time:  Mondays 3:306:20pm 
Office hours:  Mondays 13pm 
Textbook:  Ming Li and Paul Vitanyi, An
introduction to Kolmogorov
complexity and its applications. 2n edition or 3rd edition, Springer,
2008 
The course material
will be mainly from our textbook, the ppt files,
and papers that will be provided
at this website. The students will do a research project at
the end of the term, and present it in class.
Note that this course is not a bioinformatics course. It is a
general course on Kolmogorov complexity and many of its applications.
Kolmogorov complexity is a modern theory of information and randomness.
The goal of this course is to teach the students the basic concepts of
Kolmogorov complexity and how to use this tool in their own research.
Topics include: plain Kolmogorov complexity, randomness,
prefix Kolmogorov complexity, incompressibility method,
information distance, applications in various fields ranging from
averagecase analysis of algorithms, Lovasz Local Lemma,
to bioinformatics; and from
document comparison to internet search. We will not be able to cover
the complete textbook, rather we will focus more on recent
developments and especially on applications of Kolmogorov
complexity.
Marking Scheme: Each student is expected to do three homeworks (20% each)
and a final project and presentation in class (35%),
which can be either theoretical or programming,
and 5% class attendance and participation in discussions.
Course announcements and lecture notes will appear on this page.
You are responsible for reading it (and downloading the
lecture notes before classes) regularly.
Assignments:
(All exercise numbers refer to both the 2nd edition and 3rd edition
of the textbook unless otherwise stated)

The assignment 1 (Due Jan. 25 in class)
contains the following problems from the book:
2.1.1[a], 2.1.2, 2.1.3, 2.1.5, 2.1.13 (This one is 2.8.5 in
the 3rd edition, as this one
requires "Symmetry of Information"),
2.2.5, 2.2.6, 2.2.8, and optional 2.3.4.

Assignment 2 (Due Feb 8th in class).
Do Exercises 2.4.2, 2.5.1, 2.5.2, 2.5.3, 2.7.5 from the text
book, and prove the following statement: If a binary sequence x
of length n has exactly n/2 0's, then x is not cincompressible, for large
enough n.

Assignment 3 (Due Feb 27th in class). Problem 1: Use Kolmorogov complexity
to prove that a 1 head 2way DFA cannot accept
L={w#w  w in {0,1}*}. Problem 2: Using Kolmogorov complexity, obtain
the average case lower bound for binary search.
Problem 3: Using Kolmogorov complexity (directly, without using
theorems from Chapter 6), prove that {0^n 1^m  m>n} is not regular.
Problem 4. Do Excercise 6.8.5. Problem 5: Shaker sort
compares each adjacent pair of items in a list in turn, swapping
them if necessary, and alternately passes through the list from the
beginning to the end then from the end to the beginning. It stops when
a pass does no swaps. Using incompressibility argument, prove a tight
average case lower bound for Shaker sort. Problem 6. Do Exercise
6.3.1, using Kolmogorov complexity. Problem 7. Do Exercise
6.3.2 (This is 6.3.3 in the 3rd edition), using Kolmogorov complexity.
Possible project topics:

Energytime trade off for information processing (lecture 4).
This is a theoretical research oriented project, related to Lecture 4.

Alternative normalization of E(x,y) in Lecture 4.

Information distance among many objects, and possible extensions.

Interesting applications of Kolmogorov complexity to LLL type of problems.

Experimental project. Compare information distance against
traditional distances for internet search, as discussed in Lecture
5 notes.

Simplify and improve averagecase analysis of
Quicksort using Kolmogorov complexity (I will provide the papers).

Improve the Shellsort averagecase lower bound (Open). This is very
difficult.

Apply Kolmogorov complexity to a problem in your own research.

Please do not hesitate to discuss with me on both finding and solving
your project problems. Please let me know what you wish to
work on. I generally prefer people working
on the different projects.

Fix a major error in the book (3rd edition) and present a research
topic in that direction.
(For example, fix Exercise 6.3.9, on page 459 and present that
proof and perhaps related results you could generalize from
that in class.)

New project: use information distance approach to detect breast cancer
sites in mammogram images. See this paper
by B.J.L. Campana
and E. Keogh.

New Project: close the gap of logn upper bound
and (1/2)logn lower bound for the
following problem: given a sequence of k stacks linearly connected
from left to right. The input is a permutation of n integers to the
leftmost stack. The integers move in order (not passing each other
unless they go into stacks) from left to right and any of them can go into
any stack when they arrive at that stack and popped out at some subsequent
steps. The output is from the rightmost stack and should be sorted
in increasing order. How many stacks do we need?
Note: Luke Schaeffer has improved (1/2)logn to 0.513logn.
Lectures:
Note: The lecture notes are constantly being improved, so it is a good idea
to download the most recent version before class.

Lecture 1. History, definition of Kolmogorov complexity, basics
(Chapters 1 and 2). 1 week.
Lecture 1 notes, ppt file

Lecture 2: Randomness. Definitions and MartinLof tests
(Sections 2.4, 2.5). 1 week.
Lecture 2 notes, ppt file

Lecture 3: Symmetry of Information (Section 2.8). 1 week.
Lecture 3 notes, ppt file

Lecture 4: Information Distance
normalized information distance and applications.
(Sections 8.2, 8.3, 8.4, 8.6
this
paper,
this
paper, and
this paper). 23 weeks.
Lecture 4 notes, ppt file). 2 weeks.
Lecture 4 QA notes). 2 weeks.

Lecture 5: The incompressibility method and its applications in
average case analysis and lower bounds (Chapter 6)
Lecture 5 notes, ppt file
Additional lecture 5 notes
by Tao Jiang, ppt file 2 weeks.

Lecture 5.1: Lecture notes for June 28. The incompressibility method
continued. Lecture 5.1 notes, ppt file
Sorry about being late for this one. Please download before class if you
see this. 1 week.

Lecture 6: Prefix Kolmogorov complexity (Chapter 3)
Lecture 6 notes, ppt file
1 week.

Lecture 7: Inductive Inference (Chapter 5)
Lecture 7 notes, ppt file
1 week.

Lecture 8: Kolmogorov complexity and the nature (Chapter 8)
Lecture 8 notes, ppt file
1/2 week.

Lecture 9: Resource bounded Kolmogorov complexity and computational
complexity (Chapter 7).
Lecture 9 notes, ppt file
For those who are interested in reading more, you can read
this Lance Fortnow notes). 1 week.

Student presentations: 2 weeks (1/2 hour each student).
Announcements:
Maintained by Ming Li