Solutions to Homework 3
Problem 1
We begin with the observation that if a graph is $\varepsilon$-far from begin connected, then it has at least $\varepsilon \cdot n \cdot d$ connected components, otherwise it would require less than $\varepsilon \cdot n \cdot d$ edges to connect all vertices.
Thus, for this problem we can assume that $d \cdot \varepsilon \leq 1$, otherwise the problem is trivial, as we would only need to distinguish between a connected graph and a graph with no edges.
Additionally, we can also bound the size of a good fraction of connected components. In particular, letting $t$ be a parameter, we have that any graph has at most $n/t$ connected components of size at least $t$. To see this, if $r$ is the number of connected components of size at least $t$, by looking at the number of vertices we have $$ n \geq r \cdot t \implies r \leq \frac{n}{t} $$
Hence, we can use the above to bound the number of vertices in large connected components (i.e., connected components of size $\geq t$) in an $\varepsilon$-far graph. Since we have at least $\varepsilon \cdot n \cdot d$ connected components, and at most $n/t$ connected components of size at least $t$, we have that there are at least $\varepsilon \cdot n \cdot d - n/t$ connected components of size less than $t$. Thus, the number of vertices in large connected components is at most $n - \varepsilon \cdot n \cdot d + n/t$.
Hence, if we set $t = d/\varepsilon$, we have that any $\varepsilon$-far (from being connected) graph has at most $n/t = n \cdot \varepsilon/d$ connected components of size at least $t = d/\varepsilon$. Thus, most components will be small.
Now we are ready for our algorithm.
Algorithm:
- randomly sample $s$ vertices from the graph, and from each of these vertices, run BFS until we either stop or see at least $\lceil d/\varepsilon \rceil$ distinct vertices.
- if any BFS finds a connected component (i.e., it completes before seeing $\lceil d/\varepsilon \rceil$ distinct vertices), return FAIL
- otherwise, return PASS
Correctness Analysis: note that if the graph is connected, then the BFS will always find $\lceil d/\varepsilon \rceil$ distinct vertices, and thus the algorithm will always return PASS. On the other hand, if the graph is $\varepsilon$-far from being connected, then the algorithm will return FAIL iff we manage to sample a vertex from one of the $\varepsilon \cdot n \cdot d$ connected components of size less than $\lceil d/\varepsilon \rceil$.
Note that in an $\varepsilon$-far graph, there are at least $\varepsilon \cdot n \cdot d - n \cdot \varepsilon/d = \varepsilon \cdot n \cdot (d - 1/d)$ connected components of size less than $\lceil d/\varepsilon \rceil$. Hence, if we sample a vertex uniformly at random, the probability of NOT sampling a vertex in one of these components is at most $$ \dfrac{n - \varepsilon \cdot n \cdot (d - 1/d)}{n} = 1 - \varepsilon \cdot (d - 1/d) $$ If we set $s = \frac{2}{\varepsilon \cdot (d - 1/d)}$, we have that the probability of NOT sampling a vertex in one of these components is at most $$\left(1 - \varepsilon \cdot (d - 1/d) \right)^{\frac{2}{\varepsilon \cdot (d - 1/d)}} \leq e^{-2} < 1/4$$ Thus, our algorithm will fail with probability at most $1/4$.
Running time: The running time of this algorithm is given by $s$ times the running time of the truncated BFS, which is $O(d^2/\varepsilon)$, since for each visited vertex, we look at all its ($\leq d$) neighbors.
Problem 2
Part 1
If the transition matrix is irreducible, the underlying directed graph is strongly connected. Thus, from any vertex $j$ we can reach the vertex $i$ which has $P_{i,i} > 0$. In this case, if we have a path $(j, \alpha_1, \ldots, \alpha_k, i, \alpha_{k+1}, \ldots, \alpha_r, j)$ in our graph (from the Markov chain), we will also have that $(j, \alpha_1, \ldots, \alpha_k, i, i, \alpha_{k+1}, \ldots, \alpha_r, j)$ is another possible path taken by the Markov chain. Hence, any other vertex $j$ is aperiodic, which means that this Markov chain is aperiodic.
Part 2
Note that this Markov chain is irreducible, since from any vertex $(\alpha, \beta)$ we can reach any other vertex $(i,j)$ by first walking on the first coordinate $(i - \alpha) \bmod a$ steps and then walking on the second coordinate $(j - \beta) \bmod b$ steps.
Now, we need to show that this Markov chain is aperiodic iff $\gcd(a,b) = 1$.
By symmetry of the random walk, as we are walking over $\mathbb{Z}_a \times \mathbb{Z}_b$, we have that the period of the Markov chain is the same for all vertices. Thus, it is enough to prove the statement (on periodicity) for the vertex $(0,0)$.
We first prove that if $\gcd(a,b) \neq 1$, then the Markov chain is periodic. Let $d = \gcd(a,b)$, and note that at step $t$ we must be at a coordinate $(i,j)$ such that $i+j \equiv t \bmod d$. WE prove this by a simple induction: this holds for $t = 0$, since we start at $(0,0)$, and if it holds for $t$, then at step $t+1$ we have two cases: $(i+1 \bmod a) + j \bmod d \equiv i+j+1 \bmod d$ (as $d \mid a$), or $i + (j+1 \bmod b) \bmod d \equiv i+j+1 \bmod d$ (as $d \mid b$). Thus, we have that the period of the Markov chain is divisible by $d$, as to come back to the same vertex we need to take a number of steps which is multiple of $d$. Thus the Markov chain is periodic.
Now, we prove that if $\gcd(a,b) = 1$, then the Markov chain is aperiodic. To see this, let $r, s$ be integers such that $ar + bs = 1$. We can assume that $r > 0, s < 0$ (as otherwise we can just take $r’ = r + b, s’ = s - a$). Note that if we take $ar$ steps in the first coordinate, we will be back at $(0,0)$. Similarly, if we take $-bs$ steps in the second coordinate, we will be back at $(0,0)$. Since $ar = -bs + 1$ we have that our Markov chain is aperiodic.
Note: you can simplify the above proof to simply prove that the period of this Markov chain is $\gcd(a,b)$, and thus it is aperiodic iff $\gcd(a,b) = 1$.
Part 3
Note that this Markov chain is given by an undirected graph, as the King moves uniformly at random to one of the neighboring squares. Hence, this Markov chain is irreducible iff the graph is connected, which in this case it is, since the king can reach any square from any other square. Moreover, this Markov chain is aperiodic, since from any square the king can return to the same square in 2 steps (by moving to a neighboring square and then back to the original square), as well as in 3 steps (by moving diagonally to a neighboring square and then using two axis-aligned steps to return to the original square).
Let’s denote the positions of the chessboard by $(i,j)$, where $1 \leq i,j \leq 8$, and we index from bottom-left to top-right.
Hence, we have the following degrees for the vertices of the graph:
- $(1,1), (1,8), (8,1)$ and $(8,8)$ have degree 3
- $(1, a), (8, a), (a, 1)$ and $(a, 8)$ have degree 5, whenever $2 \leq a \leq 7$
- $(a,b)$ has degree 8, whenever $2 \leq a,b \leq 7$
And by the handshake lemma we have $2m = 4 \cdot 3 + 24 \cdot 5 + 36 \cdot 8 = 12 + 120 + 288 = 420$, where $m$ is the number of edges in the graph.
As we saw in class, the stationary distribution for the random walk in an undirected graph is given by $\pi_{i,j} = d_{i,j}/2m$, where $d_{i,j}$ is the degree of vertex $(i,j)$ and $m$ is the number of edges in the graph.
Plugging in the above values we have our answer.
Problem 3
For this problem, in both parts we have that the directed graph of the Markov Chain is irreducible and aperiodic. Thus, we are in good shape to apply the results from class.
Part 1
In this case, we simply have a Markov Chain on the complete graph $K_3$.
Stationary distribution: As we saw in class, the stationary distribution of this Markov Chain is given by $(d_1/2m, d_2/2m, d_3/2m)$. As in our case we have $d_1 = d_2 = d_3 = 2$ and $m = 3$, we have that the stationary distribution is $(1/3, 1/3, 1/3)$.
Eigenvalues: By the result from class, we also know that the eigenvalues of the transition matrix are $1, -1/(n-1), -1/(n-1)$, where $n$ is the number of vertices. Since $n = 3$, we have that the eigenvalues are $1, -1/2, -1/2$.
Eigenvectors: the normalized (and orthonormal) eigenvectors of the transition matrix are $$v_1 = \frac{1}{\sqrt 3} \cdot \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \qquad v_2 = \frac{1}{\sqrt 2} \cdot \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} \qquad v_3 = \frac{1}{\sqrt 6} \cdot \begin{pmatrix} 1 \\ 1 \\ -2 \end{pmatrix} $$ corresponding to the eigenvalues $1, -1/2, -1/2$, respectively.
Mixing time: Now we have 2 ways to proceed: one is to use prof. Lau’s notes and use the fact that the $\varepsilon$-mixing time is given by $\frac{1}{\lambda} \cdot \log(n/\varepsilon) $, where $\lambda$ is the spectral gap, defined as $\lambda := \min{ 1 - \lambda_2, 1 - |\lambda_n| }$ where $\lambda_2$ and $\lambda_n$ are the second largest and smallest eigenvalues of the transition matrix, respectively. In our case, we have that $\lambda = 1/2$, $n=3$, and thus the $\varepsilon$-mixing time is $2 \cdot \log(3/\varepsilon)$.
Probability of return: the probability of return of the $i^{th}$ state is simply given by the formula $e_i^\dagger A^t e_i$. The formula, given the eigenvalues and eigenvectors computed above, is given by: $$ e_i^\dagger A^t e_i = \frac{1}{3} + \frac{2}{3} \cdot \left( \frac{-1}{2} \right)^t $$
Note that the probability of return of the $i^{th}$ state is the same for all $i$, and thus we have that the probability of return of the Markov Chain is given by the above formula.
Part 2
In this case, our transition matrix is given by $$ A = \begin{pmatrix} 0 & 2/3 & 1/3 \\ 1/3 & 0 & 2/3 \\ 2/3 & 1/3 & 0 \end{pmatrix} $$
Eigenvalues: Note that the characteristic polynomial of this matrix is given by $p(t) = t^3 - 2t/3 - 1/3$, and thus the eigenvalues are the roots of this polynomial, which are $\lambda_1 = 1, \lambda_2 = \frac{-3 + \sqrt{3} i}{6}, \lambda_3 = \frac{-3 - \sqrt{3} i}{6}$. Thus, our matrix $A$ is diagonalizable.
Eigenvectors: since this matrix has 3 distinct eigenvalues, it also has 3 linearly independent eigenvectors, and thus we can find them by solving the system $(A - \lambda I) v = 0$ for each eigenvalue $\lambda.$ This gives us the following (normalized) eigenvectors:
$$ v_1 = \dfrac{1}{\sqrt{3}} \cdot \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \qquad v_2 = \dfrac{1}{\sqrt{21}} \cdot \begin{pmatrix} \frac{-3 \sqrt{3} + i}{2} \\ \frac{\sqrt{3} - 5 i}{2} \\ \sqrt{3} + 2 i \end{pmatrix} \qquad v_3 = \dfrac{1}{\sqrt{21}} \cdot \begin{pmatrix} \frac{-3\sqrt{3} - i}{2} \\ \frac{\sqrt{3} + 5 i}{2} \\ \sqrt{3} - 2 i \end{pmatrix} $$
Hence, we have $$ A = \lambda_1 \cdot v_1 v_1^\dagger + \lambda_2 \cdot v_2 v_2^\dagger + \lambda_3 \cdot v_3 v_3^\dagger $$ where $v_i^\dagger$ is the conjugate transpose of $v_i$.
Stationary distribution: In this case, we have that the stationary distribution is given by $(1/3, 1/3, 1/3)$, as in the previous part.
(You could have guessed it, since the column sums are all 1, and thus the uniform distribution is a stationary distribution. However, the principled approach from the above is also good, and avoid the guessing.)
Mixing time: here we can again proceed as in the previous part, by using the theorem from prof. Lau’s notes on the $\varepsilon$-mixing time based on the eigenvalue gap, or we can just do it directly.
Since the eigenvalue gap in this case is $\lambda = 1 - |\lambda_2| = 1 - \frac{1}{\sqrt 3}$, we have that the $\varepsilon$-mixing time is $\frac{\sqrt 3}{\sqrt 3 - 1 } \cdot \log(3/\varepsilon)$.
Probability of return: Note that the probability of return of the $i^{th}$ state is simply given by the formula $e_i^\dagger A^t e_i$. The formula, given the eigenvalues and eigenvectors computed above, is given by: $$ e_i^\dagger A^t e_i = \sum_{j=1}^3 \lambda_j^t \cdot |v_j^\dagger e_i|^2 = \frac{1}{3} \cdot \left[ 1 + \left( \frac{-3 + \sqrt{3}i}{6} \right)^t + \left( \frac{-3 - \sqrt{3}i}{6} \right)^t \right] $$ where $v_j$ is the $j^{th}$ eigenvector of $A$ and $\lambda_j$ is the $j^{th}$ eigenvalue of $A$.
Note that the probability of return of the $i^{th}$ state is the same for all $i$, and thus we have that the probability of return of the Markov Chain is given by the above formula.
Problem 4
See piazza resources.
Problem 5
See piazza resources.
Problem 6
The dual of this program is
$$ \min \sum_{e \in E} c_e x_e $$
subject to
$$
\begin{cases}
\sum_{e \in P} x_e \geq 1 & \forall \text{ s-t paths } P \\
x_e \geq 0 & \forall e \in E
\end{cases}
$$
Note that if the variables $x_e \in {0,1}$, then this integer program computes the minimum cost s-t cut of the graph. The dual program that we computed above is just a relaxation of this integer program, and note that we can assume that $x_e \in [0,1]$ since $x_e > 1$ will never appear in an optimal solution. To see this last point, note that if $x_e > 1$, then we can always substitute $x_e = 1$ and get a feasible solution with a lower cost.
Also note that the program above is feasible, as well as the primal program, and therefore we have by strong duality that the LP above will achieve its minimum value. Moreover, by the max-flow min-cut theorem and strong duality, the minimum value of the LP above is equal to the maximum values of the primal, which is the value of the max s-t flow, and thus, the max-flow min-cut theorem guarantees us that the minimum of the LP above will be the minimum cost s-t cut of the graph.