CS 898: Applications of Kolmogorov complexity
Spring 2014
Instructor: Ming Li, DC 3355, x84659, mli@uwaterloo.ca
Course time and location: Wednesdays 1:00-3:50pm, DC 2568
Office hours: Wednesdays 4-6pm, or come by my office, DC 3355
Textbook: Ming Li and Paul Vitanyi, An Introduction to Kolmogorov
Complexity and Its Applications, 3rd edition, Springer, 2008
The course material
will be mainly from the textbook, research papers, and the ppt lecture
notes.
The students will do a research project at
the end of the term, and present it in class.
This course will focus on the applications of Kolmogorov
complexity,
rather than its theory.
Kolmogorov complexity is a modern theory of information and
randomness.
The goal of this course is to teach the students the basic concepts of
Kolmogorov complexity and how to use this tool in their own research.
Topics include: plain Kolmogorov complexity, randomness,
prefix Kolmogorov complexity, the incompressibility method,
information distance, and applications in fields ranging from
average-case analysis of algorithms and the Lovász Local Lemma
to bioinformatics, and from
document comparison to internet search. We will not cover
the complete textbook; rather, we will focus on recent
developments and especially on applications. In particular, I wish
to spend more time on potential new
applications, including natural language processing and
compressive sensing.
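Since Kolmogorov complexity itself is uncomputable, practical applications such as document comparison approximate it with a real compressor. The following sketch (my own illustration, not from the course notes) computes the normalized compression distance (NCD), a computable stand-in for the normalized information distance covered in Lecture 4; the compressor choice (zlib here) and the test strings are assumptions.

```python
import zlib

def c(data: bytes) -> int:
    """Compressed length, used as a stand-in for K(data)."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"the quick brown fox jumps over the lazy dog" * 20
b_similar = b"the quick brown fox jumps over the lazy cat" * 20
c_random = bytes(range(256)) * 4  # incompressible-looking, unrelated data

print(ncd(a, b_similar))  # small: the strings share most structure
print(ncd(a, c_random))   # larger: little shared structure
```

Values close to 0 indicate that one string contains most of the information in the other; values near 1 indicate near-independence.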
Marking Scheme: Each student is expected to do three homework
assignments (20% each)
and a final project with an in-class presentation (35%),
which can be either theoretical or programming-based;
the remaining 5% is for class attendance and participation in discussions.
Course announcements and lecture notes will appear on this page.
You are responsible for reading it regularly (and downloading the
lecture notes before classes).
Assignments:
(All exercise numbers refer to both the 2nd edition and 3rd edition
of the textbook unless otherwise stated)

Assignment 1 (Due May 21 in class)
contains the following problems from the book:
2.1.1[a], 2.1.2, 2.1.3, 2.1.5, 2.1.13 (This one is 2.8.5 in
the 3rd edition, as this one requires "Symmetry of Information"),
2.2.5, 2.2.6, 2.2.8, and optional 2.3.4.

Assignment 2 (Due June 4 in class; extended to June 11).
Do Exercises 2.4.2, 2.5.1, 2.5.2, 2.5.3, and 2.7.5 from the
textbook, and prove the following statement: if a binary sequence x
of length n has exactly n/2 0's, then x is not c-incompressible, for
large
enough n.
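The last statement in Assignment 2 rests on a counting fact, which can be checked numerically (this is only an illustration, not a proof): a string with exactly n/2 zeros can be described by its index within the set of all such strings, and log2 C(n, n/2) falls below n by roughly (1/2) log2 n bits.

```python
from math import comb, log2

# Bits needed to index a length-n string within the class of
# strings having exactly n/2 zeros, versus the n bits of the
# literal string. The gap grows roughly like (1/2) * log2(n).
for n in (100, 1000, 10000):
    bits = log2(comb(n, n // 2))
    print(n, round(bits, 1), round(n - bits, 1))
```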

Assignment 3 (Due June 25 in class). Problem 1: Use Kolmogorov
complexity
to prove that a one-head two-way DFA cannot accept
L = {w#w | w in {0,1}*}. Problem 2: Using Kolmogorov complexity, obtain
the average-case lower bound for binary search.
Problem 3: Using Kolmogorov complexity (directly, without using
theorems from Chapter 6), prove that {0^n 1^m | m > n} is not regular.
Problem 4: Do Exercise 6.8.5. Problem 5: Shaker sort
compares each adjacent pair of items in a list in turn, swapping
them if necessary, and alternately passes through the list from the
beginning to the end then from the end to the beginning. It stops when
a pass makes no swaps. Using an incompressibility argument, prove a tight
average-case lower bound for Shaker sort. Problem 6. Do Exercise
6.3.1, using Kolmogorov complexity. Problem 7. Do Exercise
6.3.2 (This is 6.3.3 in the 3rd edition), using Kolmogorov complexity.
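For concreteness, the Shaker sort described in Problem 5 can be implemented directly from that description (a straightforward sketch; the shrinking-bounds optimization is a standard refinement, not mandated by the problem statement):

```python
def shaker_sort(a):
    """Cocktail/shaker sort: alternate forward and backward passes of
    adjacent-swap comparisons; stop after a pass that makes no swaps."""
    a = list(a)
    lo, hi = 0, len(a) - 1
    swapped = True
    while swapped:
        swapped = False
        for i in range(lo, hi):          # forward pass, left to right
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        hi -= 1                          # largest remaining item is placed
        for i in range(hi, lo, -1):      # backward pass, right to left
            if a[i - 1] > a[i]:
                a[i - 1], a[i] = a[i], a[i - 1]
                swapped = True
        lo += 1                          # smallest remaining item is placed
    return a

print(shaker_sort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

Counting the adjacent swaps this code performs on random inputs is a useful sanity check against the average-case bound the incompressibility argument gives.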
Possible project topics:

Kolmogorov complexity and compressive sensing.

Approximating semantics. Apply Kolmogorov complexity to model
natural language processing.

Big data and Kolmogorov complexity. Model data entities and big data
meanings.

Energy-time trade-off for information processing.
This is a theoretical, research-oriented project related to Lecture 4.

Alternative normalization of E(x,y) in Lecture 4.

Information distance among many objects, and possible extensions.

Interesting applications of Kolmogorov complexity to LLL-type
problems.

Experimental project. Compare information distance against
traditional distances for internet search, as discussed in Lecture
5 notes.

Simplify and improve the average-case analysis of
Quicksort using Kolmogorov complexity (I will provide the papers).

Extend the Kolmogorov complexity proof of the LLL to related problems.

Improve the Shellsort average-case lower bound (open). This is very
difficult.

Apply Kolmogorov complexity to a problem in your own research.

Please do not hesitate to discuss both finding and solving
your project problems with me. Please let me know what you wish to
work on. I generally prefer that people work
on different projects.

Use the information distance approach to detect breast cancer
sites in mammogram images. See this
paper
by B.J.L. Campana and E. Keogh.

Close the gap between the log n upper bound
and the (1/2) log n lower bound for the
following problem: given a sequence of k stacks connected linearly
from left to right, the input is a permutation of n integers fed into
the
leftmost stack. The integers move in order from left to right (they
cannot pass each other unless they go into stacks); each integer may
be pushed
into any stack it arrives at and popped out at some
subsequent
step. The output comes from the rightmost stack and should be sorted
in increasing order. How many stacks do we need?
Alternatively, demonstrate why Kolmogorov complexity cannot be used
for further
improvement.
Note: Luke Schaeffer has improved (1/2) log n to 0.53 log n.
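To experiment with small instances of this problem, one can model the k stacks in series and exhaustively search the move sequences. The formalization below is my own (not from the course notes): at each step an item is pushed from the input onto the leftmost stack, moved from the top of one stack to the next, or emitted from the rightmost stack, and the output must come out as 1, 2, ..., n. Passing an item straight through a stack corresponds to a push followed immediately by a pop.

```python
from functools import lru_cache

def sortable(perm, k):
    """Return True if perm (a permutation of 1..n) can be sorted
    through k stacks in series, by exhaustive memoized search."""
    n = len(perm)

    @lru_cache(maxsize=None)
    def search(i, stacks, need):
        # i: next input index; stacks: tuple of k stack tuples;
        # need: the next value the sorted output requires.
        if need > n:
            return True
        # Move the next input item onto the leftmost stack.
        if i < n:
            s = (stacks[0] + (perm[i],),) + stacks[1:]
            if search(i + 1, s, need):
                return True
        # Move the top of stack j onto stack j + 1.
        for j in range(k - 1):
            if stacks[j]:
                s = list(stacks)
                top = s[j][-1]
                s[j] = s[j][:-1]
                s[j + 1] = s[j + 1] + (top,)
                if search(i, tuple(s), need):
                    return True
        # Emit the top of the rightmost stack if it is the value needed.
        if stacks[-1] and stacks[-1][-1] == need:
            s = stacks[:-1] + (stacks[-1][:-1],)
            if search(i, s, need + 1):
                return True
        return False

    return search(0, ((),) * k, 1)

print(sortable((2, 3, 1), 1))  # False: 231 is not sortable with one stack
print(sortable((2, 3, 1), 2))  # True: two stacks suffice
```

The state space grows quickly, so this only helps for small n, but it is enough to probe which permutations force many stacks.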
Lectures:
Note: The lecture notes are constantly being improved, so it is a good
idea
to download the most recent version before class.

Lecture 1. History, definition of Kolmogorov complexity, basics
(Chapters 1 and 2). 1 week.
Lecture 1 notes, ppt file

Lecture 2: Randomness. Definitions and Martin-Löf tests
(Sections 2.4, 2.5). 1 week.
Lecture 2 notes, ppt file

Lecture 3: Symmetry of Information (Section 2.8). 1 week.
Lecture 3 notes, ppt file

Lecture 4: Information Distance
normalized information distance and applications.
(Sections 8.2, 8.3, 8.4, 8.6
this
paper,
this
paper, and
this paper). 2-3 weeks.
Lecture 4 notes, ppt
file.
Lecture 4 QA notes.
Lecture 4 Semantics notes.

Lecture 5: The incompressibility method and its applications in
average-case analysis and lower bounds (Chapter 6).
Lecture 5 notes, ppt
file
Additional lecture 5 notes
by Tao Jiang, ppt file 2 weeks.

Lecture 5.1: Lecture notes for June 28. The incompressibility method
continued. Lecture 5.1 notes, ppt
file
Sorry about being late for this one. Please download before class if
you see this. 1 week.

Lecture 6: Prefix Kolmogorov complexity (Chapter 3)
Lecture 6 notes, ppt file
1 week.

Lecture 7: Inductive Inference (Chapter 5)
Lecture 7 notes, ppt file
1 week.

Lecture 8: Kolmogorov complexity and nature (Chapter 8)
Lecture 8 notes, ppt file
1/2 week.

Lecture 9: Resource bounded Kolmogorov complexity and computational
complexity (Chapter 7).
Lecture 9 notes, ppt file
For those who are interested in reading more, you can read
these notes by Lance Fortnow. 1 week.

Student presentations: 2 weeks (1/2 hour each student).
Announcements:
Maintained by Ming Li