Homework 1 - Spring 2024
This homework covers the material from lectures 1 to 6.
Due date: May 21st, 10pm Waterloo time.
Required Exercises
Problem 1 - Amortized Analysis (10 points)
Design a data structure to support the following two operations for a dynamic multiset $S$ of numbers:

- INSERT($S, x$) inserts $x$ into $S$.
- DELETE-LARGER-HALF($S$) deletes the largest $\lceil |S|/2 \rceil$ elements from $S$.

Show how to implement this data structure so that any sequence of $m$ INSERT and DELETE-LARGER-HALF operations runs in $O(m)$ time.
Hint: to delete larger half, need to find the median. You can use the result that there is a linear time algorithm for finding the median in an unordered array (if you don’t know this result, you should learn about it - it is pretty cool).
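As a sanity check on the intended behavior (not the required amortized analysis), here is a minimal Python sketch; the class shape is illustrative, and sorting stands in for the linear-time selection from the hint, which is what the claimed bound would actually rely on.

```python
class Multiset:
    """Sketch of the Problem 1 data structure (illustrative names)."""

    def __init__(self):
        self.items = []

    def insert(self, x):
        # O(1) append
        self.items.append(x)

    def delete_larger_half(self):
        # Delete the largest ceil(|S|/2) elements, keeping the smallest
        # floor(|S|/2). Sorting is used here for clarity; a linear-time
        # median selection would give the O(m) total bound.
        if not self.items:
            return
        s = sorted(self.items)
        self.items = s[: len(s) // 2]
```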
Problem 2 - Amortized Analysis (10 points)
An ordered stack is a data structure that stores a sequence of items and supports the following operations:
- OrderedPush($x$) removes all items smaller than $x$ from the beginning of the sequence and then adds $x$ to the beginning of the sequence.
- Pop() deletes and returns the first item in the sequence, or NULL if the sequence is empty.
Suppose we implement an ordered stack with a simple linked list, using the obvious OrderedPush and Pop algorithms.
Prove that if we start with an empty data structure, the amortized cost of each OrderedPush or Pop operation is $O(1)$.
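The "obvious" list-based implementation can be sketched as follows (a simplified stand-in for the linked list; the front of the Python list plays the role of the head of the sequence):

```python
class OrderedStack:
    """Sketch of an ordered stack; index 0 is the head of the sequence."""

    def __init__(self):
        self.seq = []

    def ordered_push(self, x):
        # Remove all items smaller than x from the beginning,
        # then add x to the beginning.
        while self.seq and self.seq[0] < x:
            self.seq.pop(0)
        self.seq.insert(0, x)

    def pop(self):
        # Delete and return the first item, or None if empty.
        return self.seq.pop(0) if self.seq else None
```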
Problem 3 - Concentration Inequalities (15 points)
The simplest model for a random graph consists of $n$ vertices, where each of the $\binom{n}{2}$ possible edges is present independently with probability $p$. Let $X$ be the number of triangles in such a graph.
- What is the expected number of triangles?
- What is the variance?
- Give an expression for the bound on the decay of the probability, and show that the number of triangles is concentrated around the expectation.
That is, if $\mu$ is the expectation of your random variable $X$, show that $\Pr[\,|X - \mu| \geq t\,] \leq f(t)$ for some function $f$. To show that it is concentrated around the expectation, for what values of $t$ does $f(t) \to 0$?
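For experimenting with the first part, here is a Monte Carlo sketch; the closed form $\binom{n}{3}p^3$ used for comparison follows from linearity of expectation over vertex triples.

```python
import itertools
import math
import random

def triangle_count(n, p, rng):
    # Sample one G(n, p) graph and count triangles by brute force.
    adj = [[False] * n for _ in range(n)]
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            adj[u][v] = adj[v][u] = True
    return sum(1 for a, b, c in itertools.combinations(range(n), 3)
               if adj[a][b] and adj[b][c] and adj[a][c])

rng = random.Random(0)
n, p, trials = 10, 0.5, 2000
avg = sum(triangle_count(n, p, rng) for _ in range(trials)) / trials
# Each of the C(n, 3) triples is a triangle with probability p^3.
expected = math.comb(n, 3) * p ** 3
```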
Problem 4 - Concentration Inequalities (15 points)
In this problem, we are in the setting where given a set
Part 1 (5 points)
Show that given a set
Hint: to show an impossibility result you need to show two distinct sets of
Part 2 (10 points)
Given a set
Problem 5 - Balls and Bins (10 points)
Consider again the experiment in which we toss $m$ balls independently and uniformly at random into $n$ bins, and let $X$ be the number of empty bins.
- By writing $X = \sum_i X_i$ for suitable indicator random variables $X_i$, show how to derive the following exact formula for the variance of $X$:
  $$\mathrm{Var}[X] = n\left(1 - \frac{1}{n}\right)^m + n(n-1)\left(1 - \frac{2}{n}\right)^m - n^2\left(1 - \frac{1}{n}\right)^{2m}$$
- What is the asymptotic value (as $n \to \infty$) of $\mathrm{Var}[X]$ in the cases $m = n$ and $m = n \ln n$?
  Hint: you may use the approximations $(1 - 1/n)^m \approx e^{-m/n}$ and $(1 - 2/n)^m \approx e^{-2m/n}$.
- When throwing $n$ balls into $n$ bins, what is the expected number of bins with exactly one ball? Compute an exact formula and its asymptotic approximation.
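A simulation sketch for the last part (helper names are illustrative); the comparison value $m(1-1/n)^{m-1}$ comes from linearity of expectation over bins.

```python
import random

def bins_with_one_ball(n, m, rng):
    # Throw m balls uniformly into n bins; count bins holding exactly one.
    counts = [0] * n
    for _ in range(m):
        counts[rng.randrange(n)] += 1
    return sum(1 for c in counts if c == 1)

rng = random.Random(0)
n = m = 50
trials = 4000
avg = sum(bins_with_one_ball(n, m, rng) for _ in range(trials)) / trials
# A given bin holds exactly one ball with probability
# m * (1/n) * (1 - 1/n)^(m-1); sum over the n bins.
expected = m * (1 - 1 / n) ** (m - 1)
```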
Problem 6 - Coupon Collector (10 points)
Part 1 (5 points)
Let us consider the coupon collector problem: we toss balls independently and uniformly at random into $n$ bins, and let $T$ be the number of tosses needed until every bin contains at least one ball.

Prove that $\Pr[T > t] = \sum_{j=1}^{n} (-1)^{j+1} \binom{n}{j} \left(1 - \frac{j}{n}\right)^{t}$.
Hint: Inclusion-Exclusion.
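A quick simulation can sanity-check whichever formula you derive; here it compares the empirical mean of $T$ against the classical coupon-collector expectation $n H_n$.

```python
import random

def tosses_until_full(n, rng):
    # Toss balls uniformly into n bins until every bin is non-empty.
    seen, tosses = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        tosses += 1
    return tosses

rng = random.Random(0)
n, trials = 20, 3000
avg = sum(tosses_until_full(n, rng) for _ in range(trials)) / trials
harmonic = sum(1 / k for k in range(1, n + 1))
expected = n * harmonic  # classical E[T] = n * H_n
```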
Part 2 (5 points)
Consider the following variation of the coupon collector’s problem. Each box of cereal contains one of $2n$ different coupons. The coupons are organized into $n$ pairs, so that coupons 1 and 2 are a pair, coupons 3 and 4 are a pair, and so on. Once you obtain one coupon from every pair, you can obtain a prize. Assuming that the coupon in each box is chosen independently and uniformly at random from the $2n$ possibilities, what is the expected number of boxes you must buy before you can claim the prize?
Problem 7 - More hash functions (15 points)
Consider the following examples of hash families.
For each one, prove that it is 2-universal or give a counterexample.
The universe for the hash functions is
Part 1 (5 points)
Let
Part 2 (5 points)
Let
Part 3 (5 points)
Let
Problem 8 - Karger strikes back (15 points)
To improve the probability of success of the randomized min-cut algorithm, it can be run multiple times.
- Consider running the algorithm twice. Determine the number of edge contractions and bound the probability of finding a min-cut.
- Consider the following variation. Starting with a graph with $n$ vertices, first contract the graph down to $k$ vertices using the randomized min-cut algorithm. Make $\ell$ copies of the graph with $k$ vertices, and now run the randomized algorithm on each of these reduced graphs independently. Determine the number of edge contractions and bound the probability of finding a min-cut.
- Find optimal (or at least near-optimal) values of $k$ and $\ell$ for the variation in the previous part that maximize the probability of finding a min-cut while using the same number of edge contractions as running the original algorithm twice.
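For experimentation, here is a simplified implementation of the contraction algorithm with independent repetitions (the union-find bookkeeping is an implementation choice, not part of the problem statement):

```python
import random

def karger_cut(n, edges, rng):
    # One run of the contraction algorithm on a multigraph given as a
    # list of (u, v) edges over vertices 0..n-1.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    remaining = n
    while remaining > 2:
        u, v = edges[rng.randrange(len(edges))]
        ru, rv = find(u), find(v)
        if ru != rv:          # skip edges that became self-loops
            parent[ru] = rv
            remaining -= 1
    # Edges crossing the two remaining super-vertices form the cut.
    return sum(1 for u, v in edges if find(u) != find(v))

def repeated_karger(n, edges, runs, rng):
    # Run the algorithm several times and keep the best cut found.
    return min(karger_cut(n, edges, rng) for _ in range(runs))

rng = random.Random(0)
cycle = [(i, (i + 1) % 6) for i in range(6)]  # 6-cycle: min cut is 2
best = repeated_karger(6, cycle, runs=10, rng=rng)
```

On a cycle, every contraction outcome leaves two arcs, so each run already returns the minimum cut; denser graphs are where repetition matters.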
Practice Problems
You are not required to submit solutions to these problems, but you are highly encouraged to attempt them for your own benefit. :)
Problem 1 - Amortized Analysis - from CLRS chapter 17
A stack $S$ supports the following operations:

- PUSH($S, x$) pushes object $x$ onto the top of stack $S$.
- POP($S$) pops the top of the stack and returns the popped object. Calling POP on an empty stack returns an error.
We will now modify the stack structure to support the following operation:
- MULTIPOP($S, k$) removes the top $k$ objects of stack $S$, popping the entire stack if the number of elements in $S$ is less than $k$.
First Question: what is the worst-case performance per operation of a sequence of $n$ PUSH, POP, and MULTIPOP operations?
Second Question: what is the amortized cost of each operation in such a sequence?
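The three operations can be sketched directly on a Python list (a minimal model to play with, not the analysis itself):

```python
class Stack:
    """PUSH / POP / MULTIPOP on a Python list; the end is the top."""

    def __init__(self):
        self._data = []

    def push(self, x):
        self._data.append(x)

    def pop(self):
        if not self._data:
            raise IndexError("pop from empty stack")
        return self._data.pop()

    def multipop(self, k):
        # Pop up to k objects, emptying the stack if it has fewer than k.
        popped = []
        while self._data and len(popped) < k:
            popped.append(self._data.pop())
        return popped
```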
Problem 2 - Amortized Analysis - from Jeff Erickson’s book
Suppose we can insert or delete an element into a hash table in $O(1)$ time. In order to ensure that our hash table is always big enough, without wasting a lot of memory, we will use the following global rebuilding rules:

- After an insertion, if the table is more than $3/4$ full, we allocate a new table twice as big as our current table, insert everything into the new table, and then free the old table.
- After a deletion, if the table is less than $1/4$ full, we allocate a new table half as big as our current table, insert everything into the new table, and then free the old table.

Show that, starting from an empty hash table, for any sequence of insertions and deletions, the amortized time per operation is still $O(1)$.
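A toy model of such a grow/shrink policy, assuming $3/4$ and $1/4$ thresholds (as in the standard version of this exercise); it tracks only sizes and the total number of elements moved during rebuilds, which is the quantity the amortized argument must bound.

```python
class ResizingTable:
    """Counts rebuild work for a doubling/halving policy (model only)."""

    def __init__(self):
        self.capacity = 4
        self.size = 0
        self.copies = 0  # total elements moved during rebuilds

    def insert(self):
        self.size += 1
        if self.size > 3 * self.capacity / 4:   # more than 3/4 full
            self.capacity *= 2
            self.copies += self.size            # reinsert everything

    def delete(self):
        assert self.size > 0
        self.size -= 1
        if self.size < self.capacity / 4 and self.capacity > 4:
            self.capacity //= 2                 # less than 1/4 full
            self.copies += self.size
```

Running a long sequence of inserts shows the total rebuild work staying linear in the number of operations, consistent with $O(1)$ amortized cost.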
Problem 3 - Splay Trees
In this problem we will work out some splaying counterexamples.
- Single rotations don’t work for splay trees. To show this, consider a degenerate $n$-node linked-list-shaped binary tree where each node’s right child is empty. Suppose that the (only) leaf is splayed to the root by single rotations. Show the structure of the tree after this splay. Generalize this to argue that there is a sequence of $n$ splays (each splay using only single rotations) that each take at least $\Omega(n)$ work.
- Now from the same starting tree, show the final structure after splaying the leaf with (zig-zig) double rotations. Explain how this splay has made much more progress than single rotations in “improving” the tree.
- Given the theorem about access time in splay trees, it is tempting to conjecture that splaying does not create trees in which it would take a long time to find an item. Show this conjecture is false by showing that for large enough $n$, it is possible to restructure any binary tree on $n$ nodes into any other binary tree on $n$ nodes by a sequence of splay requests. Conclude that it is possible to make a sequence of requests that cause the splay tree to achieve any desired shape.
Hint: start by showing how you can use splay operations to make a specified node into a leaf, then recurse.
Problem 4 - Probability - from MU'05 chapter 1
- Consider the set $N = \{1, 2, \ldots, n\}$. We generate a subset $X$ of this set as follows: a fair coin is flipped independently for each element of the set; if the coin lands heads then the element is added to $X$, and otherwise it is not. Argue that the resulting set $X$ is equally likely to be any one of the $2^n$ possible subsets.
- Suppose that two sets $X$ and $Y$ are chosen independently and uniformly at random from all the $2^n$ subsets of $N$. Determine $\Pr[X \subseteq Y]$ and $\Pr[X \cup Y = N]$.
Hint: use part 1 of the problem.
Problem 5 - Probability
- Consider the following probabilistic process:
  - start with $x = 0$.
  - Repeat $n$ times: toss a fair coin, and if it comes up heads, make $x = x + 1$. Otherwise, make $x = x - 1$.

Let $X$ be the value of $x$ at the end of the process.

- Compute $E[X]$ and $\mathrm{Var}[X]$.
- For even $n$, show that $\Pr[X = 0] = \binom{n}{n/2} \, 2^{-n}$. What happens for odd $n$?
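A quick simulation of the $\pm 1$ process (variable names are illustrative); it checks the empirical mean against 0 and the empirical return probability against the even-$n$ binomial count.

```python
import math
import random

def walk(n, rng):
    # n steps of +1 (heads) or -1 (tails) with a fair coin, from 0.
    x = 0
    for _ in range(n):
        x += 1 if rng.random() < 0.5 else -1
    return x

rng = random.Random(0)
n, trials = 10, 20000
samples = [walk(n, rng) for _ in range(trials)]
mean = sum(samples) / trials
frac_zero = sum(1 for x in samples if x == 0) / trials
# Returning to 0 after n steps requires exactly n/2 heads (even n only).
exact_zero = math.comb(n, n // 2) / 2 ** n
```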
Problem 6 - Covariance and Variance of Correlated Random Variables
In this question we will explore how to use covariance to compute the variance of correlated random variables.

- Let $X_1, X_2, \ldots, X_n$ be random variables such that $X_1 = X_2 = \cdots = X_n$; thus, we are in the case where the random variables are all the same random variable.

Let $X = \sum_{i=1}^{n} X_i$. Compute $\mathrm{Var}[X]$.
- Let
be random variables such that and for we have
Let
Problem 7 - Martingales
Consider a fair game in a casino: on each play, you may stake any amount of money. With probability $1/2$ you win, gaining an amount equal to your stake; with probability $1/2$ you lose your stake. Plays are independent.
- What is the expected number of plays until you first win?
- The following gambling strategy, known as the “martingale,” was popular in European casinos in the 18th century: on the first play, stake 1 dollar; on the second play, stake 2 dollars; on the $j$th play, stake $2^{j-1}$ dollars. Stop playing and leave the casino when you first win! Show that if you follow the martingale strategy, you will leave the casino 1 dollar richer with probability 1.
- Let us now discuss the catch in the strategy above: let $Z$ be the random variable that measures your maximum loss before winning. Show that $E[Z] = \infty$. What does this imply about your ability to play the martingale strategy in practice?
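A simulation of the doubling strategy illustrates both claims (a sketch, not part of the required proof): the net profit is always exactly 1, while the loss accumulated before the final win can be large.

```python
import random

def martingale(rng):
    # Double the stake after each loss; stop at the first win.
    stake, loss = 1, 0
    while rng.random() >= 0.5:   # this play is a loss
        loss += stake
        stake *= 2
    # On the winning play we gain `stake`, which covers all losses plus 1.
    return stake - loss, loss    # (net profit, loss incurred before winning)

rng = random.Random(0)
results = [martingale(rng) for _ in range(2000)]
```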
Problem 8 - Generating random strings
Alice and Bob have a fair coin and they are trying to generate a random sequence of $2n$ coin tosses containing exactly $n$ heads and $n$ tails, with all such sequences equally likely.
- Alice suggests to flip the coin $2n$ times. If we get exactly $n$ heads, output the sequence; otherwise, try again. How many tosses do you expect Alice will have to make under this scheme?
- Bob claims that he has a much more efficient scheme: flip the coin until you have either $n$ heads or $n$ tails. Output this sequence, padded at the end with tails or heads respectively to make the total length $2n$. Clearly this scheme requires at most $2n - 1$ tosses. Is this a good scheme? Justify your answer with a proof.
- Suggest a simple scheme for solving this problem that is better than the previous two schemes. How many tosses does your scheme require in expectation?
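To get a feel for Alice’s scheme, here is a small simulation; the comparison value uses the standard success probability $\binom{2n}{n}/4^n$ per attempt.

```python
import math
import random

def alice_tosses(n, rng):
    # Flip 2n coins per attempt; retry until exactly n of them are heads.
    tosses = 0
    while True:
        tosses += 2 * n
        heads = sum(1 for _ in range(2 * n) if rng.random() < 0.5)
        if heads == n:
            return tosses

rng = random.Random(0)
n, trials = 5, 3000
avg = sum(alice_tosses(n, rng) for _ in range(trials)) / trials
# Each attempt succeeds with probability C(2n, n) / 4^n, so the expected
# number of tosses is 2n divided by that probability.
p = math.comb(2 * n, n) / 4 ** n
expected = 2 * n / p
```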
Problem 9 - Concentration Inequalities
Questions from lecture 3:
- Is Markov’s inequality tight? If it is, give examples of random variables for which it is tight.
- Give examples of discrete probability distributions which do not have a $k$th moment. Also give examples of discrete probability distributions which do not have a $k$th central moment.
- Can you generalize Chebyshev’s inequality to higher order moments?
- Work out the proof of the lower tail bound for Chernoff’s bound.
- Work out the proof of Hoeffding’s generalization of Chernoff’s inequality.
Problem 10 - Reservoir Sampling - from Mu'05 chapter 2
Suppose we have a sequence of items passing by one at a time. We want to maintain a sample of one item with the property that it is uniformly distributed over all the items that we have seen at each step. Moreover, we want to accomplish this without knowing the total number of items in advance or storing all of the items that we see.
- Consider the following algorithm, which stores just one item in memory at all times.
When the first item appears, it is stored in memory. When the $k$th item appears, it replaces the item in memory with probability $1/k$. Show why this algorithm solves the problem.
Note: the above algorithm is called reservoir sampling.
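The first part’s algorithm is short enough to test directly; a sketch with a fixed seed, checking that each of the stream’s items is kept roughly equally often:

```python
import random

def reservoir_sample(stream, rng):
    # Keep one item; the k-th item replaces it with probability 1/k.
    sample = None
    for k, item in enumerate(stream, start=1):
        if rng.random() < 1.0 / k:
            sample = item
    return sample

rng = random.Random(0)
counts = [0] * 5
for _ in range(20000):
    counts[reservoir_sample(range(5), rng)] += 1
```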
- Suppose that we modify the reservoir sampling algorithm from the previous part so that, when the $k$th item appears, it replaces the item in memory with probability $1/2$. Describe the distribution of the item in memory.
Problem 11 - Exploring Chernoff
In this question, we will think about the following: when are Chernoff bounds actually giving us any non-trivial information?
- Let
and be the random variable which takes value with probability , and takes the value with probability . Let .
  - What is the expectation of ?
  - Why can we use Chernoff here, since as defined is not a Poisson trial? What do we have to do to use Chernoff here?
  - What does Chernoff say about ?
- One may be tempted to think in the example above that the reason why Chernoff didn’t give any non-trivial information was that the expectation is small compared to
. However, consider the following random variable: , where is the random variable of part 1 and can be any positive real number.
Prove that even though the expectation can now be large compared to , Chernoff still does not say anything non-trivial.
Hint: for Chernoff we need Poisson trials.
- What is the minimum number of independent copies of needed if we want Chernoff to say anything meaningful for , where ?
Problem 12 - Bashing
Perfect hashing is nice, but does have the drawback that the perfect hash function has a lengthy description (since you have to describe the second-level hash function for each bucket). Consider the following alternative approach to producing a perfect hash function with a small description. Define bi-bucket hashing, or bashing, as follows. Given
- Suppose a random function (i.e., all function values are uniformly random and mutually independent) is used to map each item to two buckets. Give a good upper bound on the expected number of collisions (i.e., the number of pairs of items that are placed in the same bucket).
Hint: what is the probability that the
- Argue that bashing can be implemented efficiently, with the same expected outcome, using the ideas from 2-universal hashing.
- Conclude an algorithm with linear expected time (ignoring array initialization) for identifying a perfect bash function for a set of $n$ items. How large is the description of the resulting function?
Problem 13 - From Lecture 5
- Check that the hash family we constructed in slide 51 is strongly 2-universal. Is it strongly 3-universal?
- Prove that the hash family from slide 54 is 2-universal. Prove also that it is not strongly 2-universal.
- Prove the proposition from slide 55.
- Construct a family of 3-wise independent hash functions, which is not 4-wise independent. Prove that this family is 3-wise independent and not 4-wise independent.
(Hint: see slide 59.)
- Prove that the number of collisions is an upper bound on the square of the maximum load of a hash table.
- Use Markov’s inequality and the previous part to prove the second lemma from slide 72.
Problem 14 - Karger’s algorithm: the cut menace
Another problem about Karger’s randomized algorithm for minimum cut:
- Suppose Karger’s algorithm is applied to a tree. Show that it finds a minimum cut in the tree with probability 1.
- Consider the following modification of Karger’s algorithm: instead of choosing an edge uniformly at random and merging its endpoints, the algorithm chooses any two distinct vertices uniformly at random and merges them. Show that for any $n$ there is a graph $G_n$ with $n$ vertices such that when the modified algorithm is run on $G_n$, the probability that it finds a minimum cut is exponentially small in $n$.
- How many times would you have to repeat the modified algorithm of the previous part to have a reasonable chance of finding a minimum cut? What does this tell us about the practicality of the modified algorithm?
- Show that for any $n \geq 2$ there is a connected graph with $n$ vertices that has $\binom{n}{2}$ distinct minimum cuts.