Homework 3

This homework covers the material from lectures 8 to 11

Due date: June 16th, 11:59pm Waterloo time.

LaTeX template, in case you want to write it in LaTeX.

Required Exercises

Remember that you are only required to turn in 5 out of the 6 exercises below.

Problem 1 - Sublinear-time Algorithms (20 points)

Sublinear-time algorithms for connectedness in graphs with bounded degree.

Given a graph $G$ of maximum degree $d$ (as adjacency list), and a parameter $\epsilon > 0$, give an algorithm which has the following behavior: if $G$ is connected, then the algorithm should output PASS with probability $1$, and if $G$ is $\epsilon$-far from connected (at least $\epsilon \cdot n \cdot d$ edges must be added to connect $G$), then the algorithm should output FAIL with probability at least $3/4$. Your algorithm should look at a number of edges that is independent of $n$, and polynomial in $d, \epsilon$.

For this problem, when proving the correctness of your algorithm, it is ok to show that if the input graph $G$ is likely to be passed, then it is $\epsilon$-close to a graph $G_0$ which is connected, without requiring that $G_0$ has degree at most $d$.

Problem 2 - Markov Chains (20 points)

  1. Show that if a Markov chain with transition matrix $P$ is irreducible and has a state $i$ such that $P_{i,i} > 0$, then it is also aperiodic.
  2. Let $a, b$ be positive integers and consider the Markov chain with state space $$ \{ (i,j) \mid 0 \leq i \leq a-1, \ \ 0 \leq j \leq b-1 \} $$ where $i,j$ are integers, and the following transition mechanism: if the chain is in state $(i,j)$ at time $t$, then at time $t+1$ it moves to $( (i+1) \bmod a, j)$ with probability $1/2$ or to $( i, (j+1) \bmod b)$ with probability $1/2$.

Show that this Markov chain is irreducible, and show that it is aperiodic if, and only if, $\gcd(a,b) = 1$.

  1. Consider a chessboard with a lone white king making random (king) moves, meaning that at each move, he picks one of the possible squares to move to, uniformly at random. Is the corresponding Markov chain irreducible and/or aperiodic? If so, what is the stationary distribution?

Problem 3 - Random Walks (20 points)

  1. Consider a Markov chain on the vertices of a triangle: the chain moves from a vertex to one of the other two vertices with probability $1/2$ each. Find the stationary distribution and $\varepsilon$-mixing time of this Markov chain. Also, find the probability that the chain is at the starting vertex after $n$ steps, for $n \geq 1$.

  2. Suppose that we alter the transition probabilities of the Markov chain in the previous question as follows: $$ p_{12} = p_{23} = p_{31} = 2/3, \ \ \ \ p_{21} = p_{32} = p_{13} = 1/3.$$ Find the stationary distribution and the $\varepsilon$-mixing time of this Markov chain. Also, find the probability that the chain is at the starting vertex after $n$ steps, for $n \geq 1$.

Problem 4 - Pagerank (20 points)

Suppose someone searches a keyword (like “car”) and we would like to identify the webpages that are the most relevant for this keyword and the webpages that are the most reliable sources for this keyword (a page is a reliable source if it points to many of the most relevant pages).

First we identify the pages with this keyword and ignore all other pages. Then we run the following ranking algorithm on the remaining pages.

  • Each vertex corresponds to a remaining page, and there is a directed edge from page $i$ to page $j$ if there is a link from page $i$ to page $j$. Call this directed graph $G(V, E)$, where $n = |V|$.
  • For each vertex $i$, we have two values $s(i)$ and $r(i)$, where intendedly $r(i)$ represents how relevant is this page and $s(i)$ represents how reliable it is as a source (the larger the values the better).
  • We start from some arbitrary initial values, say $s(i) = 1/n$ for all $i$, as we have no idea of their relevance at the beginning.
  • At each step, we update $s$ and $r$ (where $s$ and $r$ are vectors whose $i^{th}$ entries are $s(i)$ and $r(i)$) as follows:
  • First we update $$r(i) = \sum_{j: (j,i) \in E} s(j)$$ for all $i$, as a page is more relevant if it is linked by many reliable sources.
  • Then we update $$s(i) = \sum_{j: (i,j) \in E} r(j)$$ for all $i$ (using the just updated values $r(j)$), as a page is a more reliable source if it points to many relevant pages.
  • To keep the values small, we let $R = \sum_{i=1}^n r(i)$ and $S = \sum_{i=1}^n s(i)$, and divide each $s(i)$ by $S$ and each $r(i)$ by $R$.
  • We repeat this step for many times to refine the values.

Let $s, r \in \mathbb{R}^n$ be the vectors of the $s$ and $r$ values.

  1. Give a matrix formulation for computing $s$ and $r$.
  2. Suppose $G$ is weakly connected (that is, when we ignore the direction of the edges, the underlying undirected graph is connected) and there is a self-loop at each vertex. Prove that there is a unique limiting $s$ and a unique limiting $r$ for any initial $s$ as long as $s \geq 0$ and $s \neq 0$.

Hint: use the Perron-Frobenius theorem.

Problem 5 - Linear Programming (20 points)

Let $S = \{ x \in \mathbb{R}^n \mid A x \leq b \}$ and $T = \{ x \in \mathbb{R}^n \mid B x \leq c \}$, where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, $B \in \mathbb{R}^{t \times n}$ and $c \in \mathbb{R}^t$. Given $A, B, b, c$ as an input, give a polynomial time algorithm for the problem of checking whether $S \subset T$.

Problem 6 - LP Duality (20 points)

For a directed graph $G(V, E)$, with $2$ special vertices $s, t \in V$, let $c_e$ be the capacity of edge $e \in E$ and consider the max-flox LP: $$ \max \sum_{P \ s\text{-}t \text{ path of } G} f_P $$ subject to $$ \sum_{e \in P} f_P \leq c_e \quad \forall e \in E, $$ $$ 0 \leq f_P \quad \text{ for all paths } P. $$ The first constraint says that the total flow on all paths from $s$ to $t$ is at most the capacity of each edge, and the second constraint says that the flow on each path is non-negative.

Give the dual of this LP and explain how to interpret the dual LP as a minimum cut problem.

Practice Problems

You are not required to submit the solution to these problems, but highly encouraged to attempt them for your own knowledge sake. :)

Problem 1 - Testing Sortedness in Sublinear Time (courtesy of Nathan Harms)

In this question we mean that an array is sorted if it is sorted in increasing order.

Let $A$ be an array of $n$ integers, and let $\varepsilon \in (0,1)$. We say that $A$ is $\varepsilon$-far from sorted if every sorted array $B$ has at least $\varepsilon n$ entries that are different from $A$. In other words, at least $\varepsilon n$ entries of $A$ must be modified to turn it into a sorted array.

An $\varepsilon$-tester for sortedness with query complexity $q$ is a randomized algorithm that queries a sequence $i_1, \ldots, i_q$ of indices from $A$ and outputs YES or NO as follows:

  1. if $A$ is sorted, the algorithm output YES with probability 1
  2. if $A$ is $\varepsilon$-far from sorted, the algorithm outputs NO with probability at least $2/3$.

Question 1: the simplest idea for a tester is the random adjacent pair tester, which chooses a random index $i \in \{ 1, \ldots, n-1 \}$, queries $A[i]$ and $A[i+1]$ and rejects if $A[i] > A[i+1]$. Repeat this procedure $q$ times and reject if at any query we output reject.

Give an example array $A$ that is $1/2$-far from sorted, where this algorithm with $q=1$ repetitions outputs no with probability at most $1/(n-1)$. Conclude that the tester must repeat at least $q = \Omega(n)$ times, so the query complexity in this case is $\Omega(n)$

Question 2: Another idea is to choose $q$ independently, uniformly random indices $i_1, \ldots, i_q$ and output NO if the subsequence $A[i_1], A[i_2], \ldots, A[i_q]$ is not sorted.

Give an example of an array that is $1/2$-far from sorted where this algorithm requires $q = \Omega(\sqrt{n})$ queries.

Hint: remember the birthday paradox

Question 3: Assume each entry of $A$ is distinct. Consider the set $G$ of good indices of $A$, where index $i$ is good if using binary search on $A$ to find $A[i]$ will successfully find the index $i$.

  1. For any set $G = \{ i_1, \ldots, i_g \}$ of good indices, show that the subsequence $A[i_1], \ldots, A[i_g]$ is sorted.
  2. Show that, if $A$ is $\varepsilon$-far from sorted, then $|G| \leq (1-\varepsilon)n$.
  3. Show that there is an $\varepsilon$-tester for sortedness that uses $O(\frac{1}{\varepsilon} \log n)$ queries.

Hint: consider choosing a random index $i ~ \{1, \ldots, n\}$, querying $A[i]$, then trying to use binary search to find element $A[i]$.

Problem 2 - Mixing Time of Random Walks

Compute the mixing time (both upper bounds and lower bounds) of the dumbbell graph: the graph on $2n$ nodes that consists of two complete graphs on $n$ nodes joined by a single edge, where $n \geq 3$.

Hint: try breaking up the graph into 4 pieces which converge to the stationary distribution in the appropriate time, and then show that any other probability distribution converges quickly to one of these pieces.

Alternative Hint: you can solve this by using elementary probability calculations

Problem 3 - Random Walks (more advanced material)

A cat and a mouse each independently take a random walk on a connected, undirected, non-bipartite graph $G$. They start at the same time on different nodes, and each makes one transition at each time step. The cat catches the mouse if they are ever at the same node at some time step. Show an upper bound of $O(m^2 n)$ on the expected time before the cat catches the mouse, where $n$ is the number of vertices and $m$ is the number of edges of $G$.

Problem 4 - Linear Programming

There are $n$ terminals on a circuit board. Each terminal is either positive or negative. A wire will be attached to each terminal. It must be the case that for every pair of terminals of opposite polarity the two wires must be long enough so that they can be made to touch.

E.g., if there’s a positive terminal at point $a = (0, 0)$, and a negative terminal at $b = (0, 10)$, and another negative terminal at $c = (10, 0)$ then one solution is to put a wires of length $5$ at all three terminals. Or to put a wire of length $10$ at $a$ and length zero at the others.

Or, for any $x \in [0, 10]$, wires of length $\geq 10 - x$ at $a$, and length $\geq x$ at both $b$ and $c$.

So given the locations of the terminals (and their polarities), the problem is to compute the shortest total length of wires needed to satisfy the stated requirement.

  1. Consider the following specific instance of the problem. All the terminals are along the $x$-axis. There are positive terminals at $0, 110, 111, 112$. And there are $2$ negative terminals at $99$ and $100$.

What’s the optimal solution to the problem?

(You don’t need to prove your solution is optimal.)

  1. Formulate this problem as an LP. (Illustrate your LP by writing this for an example with two positive and two negative terminals.)
  2. Write the dual of the LP in part (2.). (Again, write down this dual for the two terminal example in part (2.).)
  3. Show how to compute an optimal solution to the dual problem of part (3.), without using a general LP solver.

Hint: see Seidel’s 2D algorithm for LPs

Problem 5 - Linear Programming

Suppose we are given a sequence of $n$ linear inequalities of the form $a_i x + b_i y \leq c_i$, where $x, y$ are variables and $a_i, b_i, c_i \in \mathbb{R}$. Collectively, these $n$ inequalities describe a convex polygon $P$ in the plane.

  1. Describe a linear program whose solution describes the largest axis-aligned square that lies entirely inside $P$
  2. Describe a linear program whose solution describes the maximum-perimeter axis-aligned rectangle that lies entirely inside $P$
  3. Describe a linear program whose solution describes the largest circle that lies entirely inside $P$

Problem 6 - Linear Programming

Suppose we are given a sequence of $n$ linear inequalities of the form $a_i x + b_i y \leq c_i$, where $x, y$ are variables and $a_i, b_i, c_i \in \mathbb{R}$. Collectively, these $n$ inequalities describe a convex polygon $P$ in the plane.

  1. Describe a polynomial time algorithm to compute two interior-disjoint axis-aligned squares with maximum total perimeter that lie entirely inside $P$
  2. Describe a polynomial time algorithm to compute two interior-disjoint axis-aligned rectangles with maximum total perimeter that lie entirely inside $P$

Problem 7 - Linear Programming Duality (basic Helly’s theorem)

Given a finite set of halfspaces $H_i := \{ x \in \mathbb{R}^d \mid a_i^T x \leq b_i, \ a_i \in \mathbb{R}^d, \ b_i \in \mathbb{R} \}$, where $1 \leq i \leq n$ such that any set of $d+1$ halfspaces have a non-empty intersection, prove that there is a point lying in the intersection of all these halfspaces.

Previous
Next