Homework 2
This homework covers the material from lectures 5 to 8
Due date: October 4th, 11:59pm Waterloo time.
PDF version of Homework 2.
LaTeX template, in case you want to write it in LaTeX.
Required Exercises
Remember that you are only required to turn in 5 out of the 6 exercises below.
Problem 1 - Bashing (20 points)
Perfect hashing is nice, but does have the drawback that the perfect hash function has a lengthy description (since you have to describe the second-level hash function for each bucket). Consider the following alternative approach to producing a perfect hash function with a small description. Define bi-bucket hashing, or bashing, as follows. Given
- Suppose a random function (i.e., all function values are uniformly random and mutually independent) is used to map each item to buckets. Give a good upper bound on the expected number of collisions (i.e., the number of pairs of items that are placed in the same bucket).
Hint: what is the probablility that the
-
Argue that bashing can be implemented efficiently, with the same expected outcome, using the ideas from 2-universal hashing.
-
Conclude an algorithm with linear expected time (ignoring array initialization) for identifying a perfect bash function for a set of
items. How large is the description of the resulting function?
Problem 2 - More hash functions (20 points)
Consider the following examples of hash families. For each one, prove that it is 2-universal or give a
counterexample. The universe for the hash functions is
- Let
be a prime number and be an integer. Let - Let
be a prime number and be an integer. Let - Let
be an integer multiple of , where and the universe is . Let - Let
be a prime number and be an integer. Let
Problem 3 - Fingerprinting (20 points)
Consider the problem of deciding whether two integer multisets
You can assume that the multisets
-
Use polynomial identity testing to give a
time algorithm for the problem above. -
Use hashing to give a
time algorithm for the problem above.
Your algorithms for both parts should succeed with probability
Problem 4 - Karger’s algorithm: the cut menace (20 points)
Another problem about Karger’s randomized algorithm for minimum cut:
- Suppose Karger’s algorithm is applied to a tree. Show that it finds a minimum cut in the tree with probability 1.
- Consider the following modification of Karger’s algorithm: instead of choosing an edge uniformly at random and merging the
endpoints, the algorithm chooses any two distinct vertices uniformly at random and merges them. Show that for any
there is a graph with vertices such that when the modified algorithm is run on , the probability that it finds a minimum cut is exponentially small in . - How many times would you have to repeat the modified algorithm of the previous part to have a reasonable chance of finding a minimum cut? What does this tell us about the practicality of the modified algorithm?
- Show that for any
there is a connected graph with vertices that has distinct minimum cuts.
Problem 5 - Karger strikes back (20 points)
To improve the probability of success of the randomized min-cut algorithm, it can be run multiple times.
-
Consider running the algorithm twice. Determine the number of edge contractions and bound the probability of finding a min-cut.
-
Consider the following variation. Starting with a graph with
vertices, first contract the graph down to vertices using the randomized min-cut algorithm. Make copies of the graph with vertices, and now run the randomized algorithm on these reduced graphs independently. Determine the number of edge contractions and bound the probability of finding a min-cut. -
Find optimal (or at least near-optimal) values of
and for the variation in the previous part that maximizes the probability of finding a min-cut while using the same number of edge contractions as running the original algorithm twice.
Problem 6 - Sublinear-time Algorithms (20 points)
Sublinear-time algorithms for connectedness in graphs with bounded degree.
Given a graph
For this problem, when proving the correctness of your algorithm, it is ok to show
that if the input graph
Practice Problems
You are not required to submit the solution to these problems, but highly encouraged to attempt them for your own knowledge sake. :)
Problem 1 - From Lecture 5
- Check that the hash family we constructed in slide 51 is strongly 2-universal. Is it strongly 3-universal?
- Prove that the hash family from slide 54 is 2-universal. Prove also that it is not strongly 2-universal.
- Prove proposition from slide 55
- Construct a family of 3-wise independent hash functions, which is not 4-wise independent. Prove that this family is 3-wise independent and not 4-wise independent.
(Hint: see slide 59.)
- Prove that the number of collisions is an upper bound on the square of the maximum load of a hash table.
- Use Markov’s inequality and the part 6 to prove the second lemma from slide 72.
Problem 2 - Testing Sortedness in Sublinear Time (courtesy of Nathan Harms)
In this question we mean that an array is sorted if it is sorted in increasing order.
Let
An
- if
is sorted, the algorithm output YES with probability 1 - if
is -far from sorted, the algorithm outputs NO with probability at least .
Question 1: the simplest idea for a tester is the random adjacent pair tester, which chooses a random index
Give an example array
Question 2: Another idea is to choose
Give an example of an array that is
Hint: remember the birthday paradox
Question 3: Assume each entry of
- For any set
of good indices, show that the subsequence is sorted. - Show that, if
is -far from sorted, then . - Show that there is an
-tester for sortedness that uses queries.
Hint: consider choosing a random index