Homework 3
This homework covers the material from lectures 13 to 18.
Due date: June 26th, 10pm Waterloo time.
LaTeX template, in case you want to write it in LaTeX.
Required Exercises
Problem 1 - Corner Points of Minimum Vertex Cover Relaxation (15 points)
Given the LP relaxation for minimum vertex-cover:
\begin{align*} \min \ \ &\sum_{v \in V} c_v \cdot x_v \\ \text{s.t.}\ \ 0 \leq \ &x_v \leq 1 \ \text{ for all } v \in V \\ x_u + \ &x_v \geq 1 \ \text{ for all } \{u,v\} \in E \end{align*}
-
Let $y$ be any feasible solution for the LP. Define another solution $y^+$ by: $$y^+_v = \begin{cases} y_v + \varepsilon \ \text{ if } 1/2 < y_v < 1, \\ y_v - \varepsilon \ \text{ if } 0 < y_v < 1/2, \\ y_v\ \ \ \ \ \ \text{ if } y_v \in \{0, \frac{1}{2}, 1\}. \end{cases} $$ Similarly define the solution $y^-_v$, by replacing $\varepsilon$ with $-\varepsilon$.
Prove that one can find $\varepsilon > 0$ such that both $y^+, y^-$ are feasible for the LP above.
-
Show that every extreme point $z$ of the LP above is half-integral, that is $z_v \in \{0, \frac{1}{2}, 1\}$ for all $v \in V$.
-
Based on the previous parts, design a 2-approximation algorithm for minimum vertex cover.
Problem 2 - Integer Linear Programming Rounding (15 points)
Consider the following maximum covering problem. Given a graph $G$ and a given number $k$, find a subset of $k$ vertices that touches the maximum number of edges.
Let $OPT(G, k)$ be the optimal number of edges touched in $G$ by a set of at most $k$ vertices.
Design an integer programming formulation for the problem, and then find a randomized rounding procedure for the corresponding linear programming relaxation, such that for given $G$ and $k$, it identifies a set of at most $2k$ vertices that touches at least $c \cdot OPT(G, k)$ edges, for some constant $c > 0$.
Your algorithm should always output a set of at most $2k$ vertices, and succeed with probability at least $1/2$.
Problem 3 - Strong Duality in SDPs (20 points)
-
Let $\alpha \geq 0$ and consider the following SDP: \begin{align*} \text{minimize } \ \ &\alpha \cdot X_{11} \\ \text{s.t.}\ \ X_{22} &= 0, \\ X_{11} &+ 2 \cdot X_{23} = 1, \\ X &\succeq 0 \end{align*}
Where $X$ is a $3 \times 3$ symmetric matrix. Prove that the dual of the SDP above is: \begin{align*} \text{maximize } \ \ &y_2 \\ \text{s.t.}\ \
\begin{pmatrix} y_2 & 0 & 0 \\ 0 & y_1 & y_2 \\ 0 & y_2 & 0 \end{pmatrix} &\preceq \begin{pmatrix} \alpha & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \end{align*} -
What is the value of the first SDP of part (1)?
-
What is the value of the dual (second SDP) of part (1)?
-
Now consider the following SDP: \begin{align*} \text{minimize } \ \ &x \\ \text{s.t.}\ \
\begin{pmatrix} x & 1 \\ 1 & y \end{pmatrix} &\succeq 0 \end{align*} Compute its dual program. -
Is the primal from part (4) strictly feasible? Is the dual strictly feasible?
-
What can you say about strong duality of the SDPs of parts 1 and 4? Are the results consistent with Slater conditions presented in class?
Problem 4 - Challenge to a Dual (20 points)
Let $G(V,E)$ be a graph, where $n = |V|$. For each $i \in V$, let $v_i \in \mathbb{R}^n$ be a vector variable associated to vertex $i$. Consider the following SDP:
\begin{align*} \min \ \ &t \\ \text{s.t.}\ \ \langle v_i , v_j \rangle &= t \ \text{ for all } i \neq j \in V, \ \{i,j\} \not\in E \\ \langle v_i, v_i \rangle &= 1 \ \text{ for all } i \in V \end{align*}
Let $\Theta \in \mathbb{R}$ be the optimum value of the SDP above.
-
Show that the following SDP has optimal value $- \Theta$: \begin{align*} \max \ \ &\langle X, e_{n+1} e_{n+1}^T \rangle \\ \text{s.t.}\ \ \langle X , e_i e_j^T \ &+ e_{n+1} e_{n+1}^T \rangle = 0 \ \text{ for all } i \neq j \in V, \ \{i,j\} \not\in E \\ \langle X , e_i e_i^T \rangle &= 1 \ \text{ for all } i \in V \\ &X \succeq 0 \end{align*} Where $X$ is a $(n+1) \times (n+1)$ symmetric matrix, and $e_1, \ldots, e_{n+1}$ are the elementary unit vectors.
-
Write down the dual of the SDP from part (a).
-
Conclude that the dual you just derived is equivalent to the following SDP:
\begin{align*} \min \ \ &\sum_{1 \leq i \leq n} Z_{ii} \\ \text{s.t.}\ \ &Z_{ij} = 0 \ \text{ for all } i \neq j \in V, \ \{i,j\} \in E \\ &\sum_{i \neq j} Z_{ij} \geq 1 \\ &Z \succeq 0 \end{align*}
Where $Z$ is an $n \times n$ symmetric matrix.
-
Rearrange the above SDP to show that the following SDP have value $\dfrac{\Theta-1}{\Theta}$:
\begin{align*} \max \ \ &\sum_{i,j \in V} Y_{ij} \\ \text{s.t.}\ \ &Y_{ij} = 0 \ \text{ for all } i \neq j \in V, \ \{i,j\} \in E \\ &\sum_{1 \leq i \leq n} Y_{ii} = 1 \\ &Y \succeq 0 \end{align*}
Where $Y$ is an $n \times n$ symmetric matrix.
Problem 5 - SDP relaxation (10 points)
Given a 3-uniform hypergraph $G(V, E)$ (that is, a hypergraph where each hyperedge has exactly 3 vertices), we say that a 2-coloring of $V$ is valid for a hyperedge $e = \{ a, b, c\} \in E$ if the hyperedge $e$ is not monochromatic upon this coloring.
The Max-2C3U problem is the following:
- Input: a 3-uniform hypergraph $G(V, E)$
- Output: a 2-coloring of the vertices of $G$ of maximum value, that is, a function $f : V \rightarrow \{ -1, 1 \}$ (the coloring) which maximizes the number of valid hyperedges.
In this question, you are asked to:
2.1. Write the optimization problem above as a quadratic program
2.2. Formulate an SDP relaxation for the problem, and prove that it is in fact a relaxation
Problem 6 - Competitive Analysis (20 points)
You have reached a river (modelled as a straight line) and must find a bridge to cross it. The bridge is at some integer coordinate upstream or downstream.
- Give a 9-competitive deterministic algorithm for optimizing the total distance travelled up and downstream before you find the bridge.
- Give a randomized 7-competitive algorithm for the problem
Problem 7 - Multiplicative Weights Update (20 points)
Here is a variation on the deterministic Weighted-Majority algorithm, designed to make it more adaptive.
- Each expert begins, on the 1st day, with weight $1$.
- We predict the result of a weighted-majority vote of the experts.
- If an expert makes a mistake, we penalize it by dividing its weight by $2$, but only if its weight was at least $1/4$ of the average weight of experts.
Prove that in any contiguous block of trials (e.g., the block from 51st day through the 77th day), the number of mistakes made by the algorithm is at most $O(m + \log n)$, where $m$ is the number of mistakes made by the best expert in that block, and $n$ is the total number of experts.
Practice Problems
You are not required to submit the solution to these problems, but highly encouraged to attempt them for your own knowledge sake. :)
Problem - Integer Linear Programming Rounding (Advanced Problem)
Given a hypergraph $G(V, E)$ where each hyperedge $e \in E$ is a subset of $V$ , our goal
is to color the vertices of $G$ using $\{ -1, +1 \}$ such that each hyperedge is as balanced as possible.
Formally, given a coloring $\gamma : V \to \{-1, +1\}$ on the vertices, we define
$$ \Delta(e) = \sum_{v \in e} \gamma(v) $$
and
$$\Delta(G) = \max_{e \in E} |\Delta(e)|.$$
Prove that if the maximum degree of the hypergraph is $d$ (i.e. each vertex
appears in at most $d$ hyperedges), then there is a coloring with
$$\Delta(G) \leq 2d - 1.$$
Hint: You may find it useful to consider the following LP, where initially we set $B_e = 0$ for all $e \in E$.
\begin{align*} &\sum_{v \in e} x_v = B_e \ \text{ for all } \ e \in E \\ &-1 \leq x_v \leq 1 \ \text{ for all } \ v \in V \end{align*}
Guidance on how to use the hint:
- The main approach will be to iteratively update the LP above to try and find our desired solution. At each iteration, we will use a solution of the LP to guide us.
- What does the LP given in the hint give us? What is captured by solutions to the LP?
- What are the basic solutions to the LP above? How do we find a basic solution?
- Now that we have a basic solution, and know how they look, we can prove the following consequence of basic solutions $$x \text{ is basic } \implies \{A_i\}_{i \in \text{supp}^*(n)} \text{ is L.I. where } \text{supp}^*(x) = \{i \in [n] \mid x_i \in (-1, 1)\}$$
- What happens if a coordinate is not in $\text{supp}^*(x)$?
- What happens when we cannot update the LP?
- How do we get unstuck? Hint: Throw away a constraint. Which one?
- At most how many vertices will THE deleted hyperedge have?
- What is the worst that can happen to an edge in this process?
Problem - SDP standard form
Transform these semidefinite programs into standard form.
Part 1
$$ \begin{align*} \text{Minimize} \quad & \text{Tr}(CX) \\ \text{subject to} \quad & A_1 X A_1^T = B, \\ & X \succeq 0, \end{align*} $$
where
$$ C = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \quad A_1 = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}. $$
Part 2
$$ \begin{align*} \text{Minimize} \quad & \text{Tr}(CX) \\ \text{subject to} \quad & A_1 X A_1^T + A_2 X A_2^T \preceq B, \\ & X \succeq 0, \end{align*} $$
where
$$ C = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}, \quad A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}. $$
Part 3
$$ \begin{align*} \text{Minimize} \quad & \text{Tr}(CX) \\ \text{subject to} \quad & A_1 X A_1^T \succeq B_1, \\ & A_2 X A_2^T \preceq B_2, \\ & X \succeq 0, \end{align*} $$
where
$$ C = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}, \quad A_1 = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad B_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad B_2 = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}. $$
Part 4
$$ \begin{align*} \text{Minimize} \quad & \text{Tr}(CX) \\ \text{subject to} \quad & A_1 X A_1^T + A_2 X A_2^T + A_3 X A_3^T = B, \\ & X \succeq 0, \end{align*} $$
where
$$ C = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}. $$
Problem - SDP duality
Compute the duals of the following SDPs:
Part 1
$$ \begin{align*} \text{Minimize} \quad & \left\langle \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, X \right\rangle \\ \text{subject to} \quad & \left\langle \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, X \right\rangle \geq 1, \\ & \left\langle \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, X \right\rangle \leq 2, \\ & X \succeq 0, \end{align*} $$
Part 2
$$ \begin{align*} \text{Minimize} \quad & \text{Tr}(CX) \\ \text{subject to} \quad & \text{Tr}(A_i X) = b_i, \quad i = 1, 2 \\ & X \succeq 0, \end{align*} $$
where
$$ C = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \quad A_1 = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \text{and} \quad b = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. $$
Part 3
Consider the following semidefinite program:
$$ \begin{align*} \text{Minimize} \quad & \text{Tr}(CX) \\ \text{subject to} \quad & \text{Tr}(A_i X) \leq b_i, \quad i = 1, 2, 3 \\ & X \succeq 0, \end{align*} $$
where
$$ C = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}, \quad A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad \text{and} \quad b = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix}. $$
Part 4
$$ \begin{align*} \text{Minimize} \quad & \text{Tr}(CX) \\ \text{subject to} \quad & \text{Tr}(A_1 X) \geq 1, \\ & \text{Tr}(A_2 X) \leq 2, \\ & X \succeq 0, \end{align*} $$
where
$$ C = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}, \quad A_1 = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}. $$
Part 5
$$ \begin{align*} \text{Minimize} \quad & \text{Tr}(CX) \\ \text{subject to} \quad & \text{Tr}(A_i X) = b_i, \quad i = 1, 2, 3 \\ & X \succeq 0, \end{align*} $$
where
$$ C = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, \quad \text{and} \quad b = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}. $$
Problem - Projections of Spectrahedra (semidefinite representations)
The $k$-ellipse with foci $(u_1, v_1), \ldots, (u_k, v_k) \in \mathbb{R}^2$ and radius $d \in \mathbb{R}$ is the following curve in the plane: $$ \left\{ (x,y) \in \mathbb{R}^2 \ \bigm| \ \sum_{i=1}^k \sqrt{(x-u_i)^2 + (y-v_i)^2} = d \right\} $$
Let $\mathcal{C}_k$ be the region whose boundary is the $k$-ellipse. That is, $\mathcal{C}_k$ is the set: $$ \left\{ (x,y) \in \mathbb{R}^2 \ \bigm| \ \sum_{i=1}^k \sqrt{(x-u_i)^2 + (y-v_i)^2} \leq d \right\} $$
Find a semidefinite representation (i.e. projection of a spectrahedron) of the set $\mathcal{C}_k$, and prove that your semidefinite representation is correct. That is, that it captures exactly the set $\mathcal{C}_k$ above.
Problem - Spectrahedra and Projections (Advanced Problem)
This problem requires basic knowledge of real analysis and topology.
-
Both spectrahedra and projected spectrahedra are convex sets. Show that spectrahedra are always closed sets.
-
Spectrahedra are always closed sets. That is, if a sequence of points $\{p_n\}_{n \geq 0}$ in the spectrahedron converges to a point $p$, then $p$ is also in the spectrahedron.
Find an example of projected spectrahedron which is not a closed set.
Problem - Paging Algorithms
A conservative algorithm is one that makes at most $k$ page faults on any consecutive subsequence of the input that contains at most $k$ distinct pages. Here $k$ is the size of the cache.
- Prove that LRU and FIFO are conservative
- Prove that any conservative algorithm is k-competitive.
Problem - Online Algorithms
You are managing a system with $n$ identical servers. You will receive a sequence of $m$ requests, denoted by $r_1, \ldots, r_m$ that need to be scheduled to be solved by the servers. Each request $r_i$ takes $p_i$ time to be processed by any server, and we don’t know these processing times $p_i$ in advance (we only know the processing time when we receive the request). At time $i$, we receive request $r_i$ which should be immediately assigned to one of the $n$ servers.
The load of a server is the sum of all the processing times of the requests assigned to it. For instance, if we assigned requests $1, 2$ and $5$ to server $1$, then the load at server $1$ is $\ell(1) = p_1 + p_2 + p_5$.
Let $M(t)$ be the load of the server which has the maximum load at time $t$ under our scheduling, and let $M^*(t)$ be the load of the server with maximum load under an offline, optimal scheduling, at time $t$.
Devise an algorithm such that $\dfrac{M(t)}{M^*(t)} \leq 2$ for all $t \geq 1$, i.e. has competitive ratio $\leq 2$.
You can assume that your input will be given by a sequence $(r_i, p_i)$ for $i \in [m]$.
Problem - Multiplicative Weights
Summer is here, and you would like to go to the beach whenever you can. To achieve that, you need to predict the weather (whether it will be a sunny or a rainy day) in your city. You have access to forecasts from $n$ different weather services, and you don’t know in advance how good each weather service is. At the end of each day, each service predicts whether it will be sunny or rainy the next day.
Your goal is to devise a strategy to combine the forecasts of the weather services and make your own prediction (sunny/rainy) each day. You want to minimize your total error, after all, you would like to maximize the number of days that you go to the beach.
Assuming that you have $T$ summer days, devise a weather prediction algorithm and analyze its performance against the best weather service.
Problem - Multiplicative Weights
Consider a collection of faculty available to give you advice about your future path in life. Since they are busy, each only answers yes or no questions. Some are wiser than others and are more often correct in their advice. You would like, looking back, to have taken the advice of the wisest faculty member, but ahead of time, you don’t know who that is. Consider the following online algorithm: each time you have a question, you ask all the faculty and go with a “weighted majority” opinion as follows:
- Initially, each faculty member has a weight of 1.
- After asking a question, take the (yes/no) answer of larger total weight.
- Upon discovering which answer is correct, halve the weight of each faculty member who answered incorrectly.
You will show that, using this algorithm, you make at most $2.41(m + \log n)$ mistakes, where $m$ is the number of mistakes made by the wisest faculty, and $n$ is the number of faculty. (Thus, in the language of asymptotic competitive ratios, you are $2.41$-competitive against asking the best faculty member.)
Part 1
Prove that, when the online algorithm makes a mistake, the total weight of the faculty decreases by a factor of $\frac{4}{3}$. Use this to upper bound the total weight assigned to faculty.
Part 2
Lower-bound the weight assigned to faculty by considering the weight of the wisest faculty member in terms of $m$.
Part 3
Combine the above two parts to prove the claim.
With randomization, you can do better. We modify the online algorithm as follows:
- Instead of going with the majority, choose one faculty member with probability proportional to their weight, and follow their opinion.
- Instead of multiplying incorrect faculty weights by $\frac{1}{2}$, multiply them by some $\beta < 1$ to be determined later.
Part 4
For a given question, let $F$ denote the fraction of the weight of faculty with the wrong answer. Give an expression for the multiplicative change in the total weight as a result of reweighing for this question.
Part 5
Arguing much as in the deterministic case, prove that the (expected) number of wrong answers will be at most $$ \frac{m \ln(1/\beta) + \ln n}{1 - \beta} $$ so for example, setting $\beta = \frac{1}{2}$ gives $1.39m + 2 \ln n$, for a competitive ratio of $1.39$.
Part 6
Show that if one uses the “right” $\beta$, one can limit the number of errors to $m + \ln n + O(\sqrt{m \ln n})$, achieving a $1$-competitive algorithm (in the asymptotic sense). (This $\beta$ can actually be found online by “repeated doubling” as the mistakes build up.)
Problem - Multiplicative Weights and Max-Flow (More advanced)
Consider the maximum flow problem from $s$ to $t$ on a directed graph, where each edge has capacity one. A fractional $s$-$t$ flow solution with value $k$ is an assignment of each edge $e$ to a fractional value $x(e) \in \mathbb{R}$, satisfying
$$ \begin{align*} \sum_{e \in \delta^{out}(s)} x(e) &= \sum_{e \in \delta^{in}(t)} x(e) = k \\ \sum_{e \in \delta^{in}(s)} x(e) &= \sum_{e \in \delta^{out}(t)} x(e) = 0 \\ \sum_{e \in \delta^{in}(v)} x(e) &= \sum_{e \in \delta^{out}(v)} x(e) \ \ \forall v \in V \setminus \{ s, t \} \\ 0 &\leq x(e) \leq 1, \ \ \forall e \in E \end{align*} $$
Use the multiplicative weights update method to solve this Linear Program by reducing the flow problem to the problem of finding shortest paths between $s$ and $t$. Analyze the convergence rate and total complexity of your algorithm to compute a flow of value $k \cdot (1-\epsilon)$ for any $\epsilon > 0$.