Semidefinite Programming, Duality Theorems & SDP Relaxations

Background on Symmetric Matrices

A matrix $A \in \mathbb{R}^{m \times m}$ is symmetric if $A = A^T$. We denote the set of symmetric $m \times m$ matrices by $\mathbb{S}^m$. We can think of $\mathbb{S}^m$ as a vector space with the inner product $$\langle A, B \rangle := \text{tr}(A^T B) = \text{tr}(AB) = \sum_{i,j} A_{ij} B_{ij}.$$ This is the Frobenius inner product, which corresponds to the usual Euclidean inner product when we vectorize the matrices. From the inner product above, we can also define the Frobenius norm $$\|A\|_F := \sqrt{\langle A, A \rangle} = \sqrt{\text{tr}(A^T A)}.$$
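
A quick numerical sanity check of these identities, as a minimal NumPy sketch (the matrices below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)); A = (A + A.T) / 2   # random symmetric matrices
B = rng.standard_normal((3, 3)); B = (B + B.T) / 2

inner = np.trace(A.T @ B)
assert np.isclose(inner, np.sum(A * B))              # <A, B> = sum_ij A_ij B_ij
assert np.isclose(np.sqrt(np.trace(A.T @ A)),        # ||A||_F
                  np.linalg.norm(A, "fro"))
```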


Standard Form of SDP

Just as with linear programs, we can write semidefinite programs in standard form. The standard form of an SDP is $$ \begin{align*} \text{minimize} \quad & \langle C, X \rangle \\ \text{subject to} \quad & \langle A_i, X \rangle = b_i, \quad i = 1, \ldots, t \\ & X \succeq 0, \ X \in \mathbb{S}^m. \end{align*} $$

In the above program, the variables are the entries of the symmetric matrix $X \in \mathbb{S}^m$. Note the similarity to the standard form of a linear program. We can obtain the LP standard form by making $X, C, A_i$ diagonal matrices.
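
To see the standard form concretely, here is a minimal sketch using cvxpy (an assumption: cvxpy with its bundled SCS solver is installed). The data is randomly generated, with $C$ chosen PSD so the problem is bounded below and $b$ chosen so the constraints are feasible:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, t = 4, 3

def random_sym(k):
    M = rng.standard_normal((k, k))
    return (M + M.T) / 2

R = rng.standard_normal((m, m))
C = R @ R.T                                    # PSD objective => bounded below
A = [random_sym(m) for _ in range(t)]
X0 = np.eye(m)                                 # a PSD matrix used to generate b
b = np.array([np.trace(Ai @ X0) for Ai in A])  # guarantees feasibility

X = cp.Variable((m, m), symmetric=True)
constraints = [X >> 0] + [cp.trace(A[i] @ X) == b[i] for i in range(t)]
prob = cp.Problem(cp.Minimize(cp.trace(C @ X)), constraints)
prob.solve()
print("optimal value:", prob.value)
```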

However, this does not look like the way we defined SDPs in the previous lecture (via LMIs). So how is the above program equivalent to those LMIs?


Equivalence of SDP Standard Form and LMIs

We can write the SDP standard form as a system of linear matrix inequalities (LMIs), and vice-versa.

SDP Standard Form as LMIs

Suppose we have an SDP in standard form $$ \begin{align*} \text{minimize} \quad & \langle C, X \rangle \\ \text{subject to} \quad & \langle A_k, X \rangle = b_k, \quad k = 1, \ldots, t \\ & X \succeq 0, \ X \in \mathbb{S}^m. \end{align*} $$

Note that we can consider each equality constraint as two inequalities, i.e., $\langle A_i, X \rangle \leq b_i$ and $\langle A_i, X \rangle \geq b_i$. Now, we can group all of these constraints into one LMI by placing them on the diagonal of a single matrix.

Let $B_{ij} \in \mathbb{S}^{2t}$ be the diagonal matrix given by $$ (B_{ij})_{kk} = \begin{cases} \alpha_{ij} \cdot (A_k)_{ij}, & k \leq t \\ -\alpha_{ij} \cdot (A_{k-t})_{ij}, & k > t, \end{cases} $$ where $\alpha_{ij} := 2$ if $i < j$ and $\alpha_{ii} := 1$ (the factor of $2$ accounts for the entry $(A_k)_{ij}$ appearing twice in $\langle A_k, X \rangle$ when $i \neq j$), and let $\Gamma \in \mathbb{S}^{2t}$ be the diagonal matrix given by $$ \Gamma_{kk} = \begin{cases} b_k, & k \leq t \\ -b_{k-t}, & k > t. \end{cases} $$

Then, we can write the SDP standard form as follows: $$ \begin{align*} \text{minimize} \quad & \langle C, X \rangle \\ \text{subject to} \quad & \sum_{i \leq j} x_{ij} B_{ij} \succeq \Gamma \\ & X \succeq 0, \ X \in \mathbb{S}^m. \end{align*} $$
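
A minimal NumPy sketch of this construction (the helper name `lmi_from_standard_form` and the data are ours, for illustration). The sanity check confirms that when the equalities $\langle A_k, X \rangle = b_k$ hold, the diagonal LMI holds with equality:

```python
import numpy as np

def lmi_from_standard_form(A_list, b):
    """Build the diagonal matrices B_ij and Gamma of the construction above."""
    t = len(A_list)
    m = A_list[0].shape[0]
    B = {}
    for i in range(m):
        for j in range(i, m):
            alpha = 2.0 if i < j else 1.0           # off-diagonal entries of X appear twice
            d = np.zeros(2 * t)
            for k in range(t):
                d[k] = alpha * A_list[k][i, j]      # entry k encodes <A_k, X> >= b_k
                d[t + k] = -alpha * A_list[k][i, j] # entry t+k encodes <A_k, X> <= b_k
            B[(i, j)] = np.diag(d)
    Gamma = np.diag(np.concatenate([b, -b]))
    return B, Gamma

# Sanity check: if <A_k, X> = b_k for all k, then
# sum_{i<=j} x_ij * B_ij - Gamma is the zero matrix.
rng = np.random.default_rng(0)
A_list = [(M + M.T) / 2 for M in rng.standard_normal((2, 3, 3))]
X = np.eye(3)
b = np.array([np.trace(Ak @ X) for Ak in A_list])
B, Gamma = lmi_from_standard_form(A_list, b)
lhs = sum(X[i, j] * Bij for (i, j), Bij in B.items()) - Gamma
assert np.allclose(lhs, 0)
```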

LMIs in SDP Standard Form

Now, suppose that we have an SDP with constraints given as an LMI: $$ \begin{align*} \text{minimize} \quad & \sum_{i=1}^n c_i x_i \\ \text{subject to} \quad & \sum_{i=1}^n x_i B_i \succeq \Gamma \\ & x \in \mathbb{R}^n, \end{align*} $$ where $B_i, \Gamma \in \mathbb{S}^m$ are symmetric matrices.

Before we convert it to the standard form, we can write $x_i = u_i - v_i$, where $u_i, v_i \geq 0$. Then, we can rewrite the above program as $$ \begin{align*} \text{minimize} \quad & \sum_{i=1}^n c_i (u_i - v_i) \\ \text{subject to} \quad & \sum_{i=1}^n (u_i - v_i) B_i \succeq \Gamma \\ & u_i, v_i \geq 0, \quad i = 1, \ldots, n. \end{align*} $$

Now, we can write the above SDP in standard form. Define the slack matrix $S := \sum_{i=1}^n (u_i - v_i) B_i - \Gamma \succeq 0$, and collect everything into one block-diagonal variable $Y \in \mathbb{S}^{2n + m}$ whose first $n$ diagonal entries are $u_1, \ldots, u_n$, whose next $n$ diagonal entries are $v_1, \ldots, v_n$, and whose bottom-right $m \times m$ block is $S$. The constraints become $$ \begin{align*} & Y \succeq 0, \\ & \langle E_{ij} + E_{ji}, Y \rangle = 0, \quad i \neq j \in [2n], \\ & \langle A_{ij}, Y \rangle = \Gamma_{ij}, \quad i \leq j \in [m], \end{align*} $$ where $E_{ij}$ has a one in entry $(i, j)$ and zeros elsewhere (so the second line forces the top-left $2n \times 2n$ block of $Y$ to be diagonal), and $A_{ij} \in \mathbb{S}^{2n + m}$ is chosen so that $\langle A_{ij}, Y \rangle = \sum_{k=1}^n (u_k - v_k) (B_k)_{ij} - S_{ij}$, encoding the definition of the slack matrix. Finally, the objective becomes $\langle \widetilde{C}, Y \rangle$, where $\widetilde{C} := \mathrm{Diag}(c, -c, 0) \in \mathbb{S}^{2n + m}$.
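
A minimal NumPy sketch of this embedding (the function name `embed` and all data are illustrative); it packs $(u, v, S)$ into the block-diagonal $Y$ and checks that the objective value is preserved:

```python
import numpy as np

def embed(u, v, B_list, Gamma, c):
    """Pack (u, v, slack matrix S) into the block-diagonal Y in S^(2n+m)."""
    n, m = len(u), Gamma.shape[0]
    S = sum((u[i] - v[i]) * B_list[i] for i in range(n)) - Gamma  # slack matrix
    Y = np.zeros((2 * n + m, 2 * n + m))
    Y[:n, :n] = np.diag(u)
    Y[n:2 * n, n:2 * n] = np.diag(v)
    Y[2 * n:, 2 * n:] = S
    C_tilde = np.diag(np.concatenate([c, -c, np.zeros(m)]))
    return Y, C_tilde

rng = np.random.default_rng(1)
n, m = 2, 3
B_list = [(M + M.T) / 2 for M in rng.standard_normal((n, m, m))]
Gamma = -10 * np.eye(m)          # a loose LMI, so the slack block is PSD here
c = rng.standard_normal(n)
u, v = np.array([0.3, 0.0]), np.array([0.0, 0.7])
Y, C_tilde = embed(u, v, B_list, Gamma, c)
assert np.isclose(np.trace(C_tilde @ Y), c @ (u - v))  # objective preserved
```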


Duality Theorems

We will now discuss the duality theory for semidefinite programs.

Weak Duality

Consider the primal SDP $$ \begin{align*} \text{minimize} \quad & \langle C, X \rangle \\ \text{subject to} \quad & \langle A_i, X \rangle = b_i, \quad i = 1, \ldots, t \\ & X \succeq 0, \ X \in \mathbb{S}^m. \end{align*} $$

If we look at what happens when we multiply the $i^{\text{th}}$ constraint by $y_i$ and sum over all constraints, we get $$ \sum_{i=1}^t y_i \langle A_i, X \rangle = \sum_{i=1}^t y_i b_i \quad \Longrightarrow \quad \left\langle \sum_{i=1}^t y_i A_i, \, X \right\rangle = y^T b. $$

Thus, if $\sum_{i=1}^t y_i A_i \preceq C$, then, since $C - \sum_{i=1}^t y_i A_i \succeq 0$ and $X \succeq 0$ imply $\left\langle C - \sum_{i=1}^t y_i A_i, \, X \right\rangle \geq 0$, we have $$ y^T b = \left\langle \sum_{i=1}^t y_i A_i, \, X \right\rangle \leq \langle C, X \rangle. $$ This tells us that $y^T b$ is a lower bound on the optimal value of the primal SDP.

Thus, if we define the dual SDP as $$ \begin{align*} \text{maximize} \quad & y^T b \\ \text{subject to} \quad & \sum_{i=1}^t y_i A_i \preceq C \\ & y \in \mathbb{R}^t, \end{align*} $$ then we have the following weak duality theorem:


Weak Duality Theorem: For any primal feasible solution $X$ and dual feasible solution $y$, we have $$ y^T b \leq \langle C, X \rangle. $$
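
Here is a minimal numerical illustration on a toy instance (one constraint, with $A_1 = I$ and $b_1 = 1$, so the primal optimum is $\lambda_{\min}(C)$); any feasible pair $(X, y)$ witnesses the inequality:

```python
import numpy as np

# Toy instance: minimize <C, X> s.t. tr(X) = 1, X PSD, with A_1 = I, b_1 = 1.
# The dual maximizes y subject to y*I <= C (in the PSD order).
C = np.array([[2.0, 1.0], [1.0, 2.0]])       # eigenvalues 1 and 3

X = np.eye(2) / 2                            # primal feasible: tr(X) = 1, X PSD
y = 0.5                                      # dual feasible: C - y*I is PSD

assert np.all(np.linalg.eigvalsh(C - y * np.eye(2)) >= 0)
print(y * 1.0, "<=", np.trace(C @ X))        # weak duality: 0.5 <= 2.0
```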


Strong Duality

In a similar way to linear programs, strong duality holds for semidefinite programs, albeit under an additional assumption, known as Slater’s condition.

In the homework, you will see that strong duality may not hold for all feasible primal and dual SDPs.

However, if we assume that both the primal and dual SDPs are strictly feasible (this is Slater’s condition), then strong duality holds.

  • The primal SDP is strictly feasible if there exists an $X \succ 0$ such that $\langle A_i, X \rangle = b_i$ for all $i$.
  • The dual SDP is strictly feasible if there exists a $y$ such that $\sum_{i=1}^t y_i A_i \prec C$.

Strong Duality Theorem: If the primal and dual SDPs are strictly feasible (i.e., if Slater’s condition holds), then the optimal values of the primal and dual SDPs are equal.
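
Continuing the toy instance above in a minimal cvxpy sketch (assuming cvxpy is installed): $X = I/2$ and $y = 0$ are strictly feasible, so Slater’s condition holds, and the two optimal values coincide at $\lambda_{\min}(C) = 1$:

```python
import cvxpy as cp
import numpy as np

C = np.array([[2.0, 1.0], [1.0, 2.0]])

# Primal: minimize <C, X> s.t. tr(X) = 1, X PSD.
X = cp.Variable((2, 2), symmetric=True)
primal = cp.Problem(cp.Minimize(cp.trace(C @ X)),
                    [X >> 0, cp.trace(X) == 1])
primal.solve()

# Dual: maximize y s.t. y*I <= C (in the PSD order).
y = cp.Variable()
dual = cp.Problem(cp.Maximize(y), [C - y * np.eye(2) >> 0])
dual.solve()

print(primal.value, dual.value)   # both are approximately 1 = lambda_min(C)
```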


Complementary Slackness

Just as with linear programs, we have complementary slackness for SDPs.


Complementary Slackness Theorem: Suppose $X$ and $y$ are primal and dual feasible solutions, respectively, and that strong duality holds. Then $X$ and $y$ are both optimal if and only if the complementary slackness condition holds: $$ \left( C - \sum_{i=1}^t y_i A_i \right) X = 0. $$
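
A minimal numerical check on the same toy instance (assuming cvxpy; the sign convention of cvxpy’s `dual_value` for equality constraints is an assumption we do not rely on, so the sketch tries both signs):

```python
import cvxpy as cp
import numpy as np

C = np.array([[2.0, 1.0], [1.0, 2.0]])

X = cp.Variable((2, 2), symmetric=True)
trace_con = cp.trace(X) == 1                 # A_1 = I, b_1 = 1
prob = cp.Problem(cp.Minimize(cp.trace(C @ X)), [X >> 0, trace_con])
prob.solve()

y = trace_con.dual_value                     # multiplier of the equality constraint
residual = min(np.linalg.norm((C - s * y * np.eye(2)) @ X.value)
               for s in (+1.0, -1.0))        # try both sign conventions
print("||(C - y A_1) X|| ~", residual)       # approximately 0
```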


SDP Relaxations

Just as we used LP relaxations to obtain approximate solutions to NP-hard problems, by formulating such problems as integer linear programs, we can use SDP relaxations to obtain approximate solutions to NP-hard problems.

Since we can formulate any NP-complete problem as an integer linear program, any combinatorial optimization problem coming from an NP-complete problem can be cast as an ILP. Hence, a question arises: why use SDP relaxations instead of LP relaxations? Do we gain anything by doing so?

Today, and in the next lecture, we will see that SDP relaxations can be more powerful than LP relaxations! Moreover, this has been a very fruitful area of research in the last 30 years, with many beautiful results (for those looking for a final project).

Quadratic Programming

A quadratic program (QP) is an optimization problem of the form $$ \begin{align*} \text{minimize} \quad & \frac{1}{2} x^T Q x + c^T x \\ \text{subject to} \quad & q_i(x) \geq 0, \quad i = 1, \ldots, t \\ & x \in \mathbb{R}^n, \end{align*} $$ where $Q \in \mathbb{S}^n$ is a symmetric matrix, and $q_i(x)$ are quadratic functions of $x$.
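
A minimal convex instance in cvxpy (QPs with indefinite $Q$ or general quadratic $q_i$ are NP-hard, so this sketch sticks to a PSD $Q$ and a ball constraint; all data is illustrative):

```python
import cvxpy as cp
import numpy as np

Q = np.array([[2.0, 0.5], [0.5, 1.0]])       # PSD, so the objective is convex
c = np.array([-1.0, -1.0])

x = cp.Variable(2)
objective = 0.5 * cp.quad_form(x, Q) + c @ x
constraints = [cp.sum_squares(x) <= 1]       # q(x) = 1 - x^T x >= 0 (unit ball)
prob = cp.Problem(cp.Minimize(objective), constraints)
prob.solve()
print(prob.value, x.value)
```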

An advantage of studying QPs is that they are a very expressive class of optimization problems (generalizing $\{0,1\}$-ILPs). However, a disadvantage is that they are NP-hard to solve in general.
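
To see why QPs generalize $\{0,1\}$-ILPs, note that the integrality constraint $x_i \in \{0, 1\}$ is captured by the pair of quadratic constraints $$ x_i^2 - x_i \geq 0 \quad \text{and} \quad x_i - x_i^2 \geq 0, $$ which together force $x_i^2 = x_i$, i.e., $x_i \in \{0, 1\}$.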

Nevertheless, we can relax QPs to SDPs, and thus we can use the same template as we used for LP relaxations to obtain approximate solutions to QPs! And as we will see in this and the next lecture, SDP relaxations can be more powerful than LP relaxations.

Main Example: Max-Cut

The Max-Cut problem is defined as follows:


Max-Cut Problem: Given a graph $G = (V, E)$, find a cut $(S, V \setminus S)$ that maximizes the number of edges between $S$ and $V \setminus S$, i.e., the number of edges across the cut, which is denoted by $|E(S, V \setminus S)|$.


While the minimum cut problem can be solved in polynomial time, the Max-Cut problem is NP-hard.
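
As a preview of where this is headed, here is a minimal cvxpy sketch of the standard SDP relaxation of Max-Cut (the Goemans–Williamson relaxation, which we will derive in the next lecture); the example graph is illustrative. The relaxation replaces labels $x_i \in \{\pm 1\}$ with a PSD matrix $X$ with unit diagonal and maximizes $\sum_{\{i,j\} \in E} (1 - X_{ij})/2$:

```python
import cvxpy as cp

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # a 4-cycle plus a chord
n = 4

X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0, cp.diag(X) == 1]            # X_ii = 1 relaxes x_i in {-1, +1}
objective = cp.Maximize(sum((1 - X[i, j]) / 2 for i, j in edges))
prob = cp.Problem(objective, constraints)
prob.solve()
print("SDP upper bound on Max-Cut:", prob.value)   # at least the true max cut (= 4 here)
```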
