Semidefinite Programming, Duality Theorems & SDP Relaxations

Background on Symmetric Matrices

A matrix $A \in \mathbb{R}^{m \times m}$ is symmetric if $A = A^T$. We denote the set of symmetric $m \times m$ matrices by $\mathbb{S}^m$. We can think of $\mathbb{S}^m$ as a vector space with the inner product $$\langle A, B \rangle := \text{tr}(A^T B) = \text{tr}(AB) = \sum_{i,j} A_{ij} B_{ij}.$$ This is the Frobenius inner product, which corresponds to the usual Euclidean inner product when we vectorize the matrices. From the inner product above, we can also define the Frobenius norm $$\|A\|_F := \sqrt{\langle A, A \rangle} = \sqrt{\text{tr}(A^T A)}.$$
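
A quick numerical sanity check of these identities, as a minimal NumPy sketch (the matrices below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)); A = (A + A.T) / 2   # random symmetric matrices
B = rng.standard_normal((3, 3)); B = (B + B.T) / 2

inner = np.trace(A.T @ B)
assert np.isclose(inner, np.sum(A * B))              # <A, B> = sum_ij A_ij B_ij
assert np.isclose(np.sqrt(np.trace(A.T @ A)),        # ||A||_F
                  np.linalg.norm(A, "fro"))
```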


Standard Form of SDP

Just as with linear programs, we can write semidefinite programs in standard form. The standard form of an SDP is $$ \begin{align*} \text{minimize} \quad & \langle C, X \rangle \\ \text{subject to} \quad & \langle A_i, X \rangle = b_i, \quad i = 1, \ldots, t \\ & X \succeq 0, \ X \in \mathbb{S}^m. \end{align*} $$

In the above program, the variables are the entries of the symmetric matrix $X \in \mathbb{S}^m$. Note the similarity to the standard form of a linear program. We can obtain the LP standard form by making $X, C, A_i$ diagonal matrices.
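
To see the standard form concretely, here is a minimal sketch using cvxpy (an assumption: cvxpy with its bundled SCS solver is installed). The data is randomly generated, with $C$ chosen PSD so the problem is bounded below and $b$ chosen so the constraints are feasible:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, t = 4, 3

def random_sym(k):
    M = rng.standard_normal((k, k))
    return (M + M.T) / 2

R = rng.standard_normal((m, m))
C = R @ R.T                                    # PSD objective => bounded below
A = [random_sym(m) for _ in range(t)]
X0 = np.eye(m)                                 # a PSD matrix used to generate b
b = np.array([np.trace(Ai @ X0) for Ai in A])  # guarantees feasibility

X = cp.Variable((m, m), symmetric=True)
constraints = [X >> 0] + [cp.trace(A[i] @ X) == b[i] for i in range(t)]
prob = cp.Problem(cp.Minimize(cp.trace(C @ X)), constraints)
prob.solve()
print("optimal value:", prob.value)
```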

However, this does not look like the way we defined SDPs in the previous lecture (via LMIs). So how is the above program equivalent to those LMIs?


Equivalence of SDP Standard Form and LMIs

We can write the SDP standard form as a system of linear matrix inequalities (LMIs), and vice-versa.

SDP Standard Form as LMIs

Suppose we have an SDP in standard form $$ \begin{align*} \text{minimize} \quad & \langle C, X \rangle \\ \text{subject to} \quad & \langle A_k, X \rangle = b_k, \quad k = 1, \ldots, t \\ & X \succeq 0, \ X \in \mathbb{S}^m. \end{align*} $$

Note that we can consider each equality constraint as two inequalities, i.e., $\langle A_i, X \rangle \leq b_i$ and $\langle A_i, X \rangle \geq b_i$. Now, we can group all of these constraints into one LMI by placing them on the diagonal of a single matrix.

Let $B_{ij} \in \mathbb{S}^{2t}$ be the diagonal matrix given by $$ (B_{ij})_{kk} = \begin{cases} \alpha_{ij} \cdot (A_k)_{ij}, & k \leq t \\ -\alpha_{ij} \cdot (A_{k-t})_{ij}, & k > t, \end{cases} $$ where $\alpha_{ij} := 2$ if $i < j$ and $\alpha_{ii} := 1$ (the factor of $2$ accounts for the entry $(A_k)_{ij}$ appearing twice in $\langle A_k, X \rangle$ when $i \neq j$), and let $\Gamma \in \mathbb{S}^{2t}$ be the diagonal matrix given by $$ \Gamma_{kk} = \begin{cases} b_k, & k \leq t \\ -b_{k-t}, & k > t. \end{cases} $$

Then, we can write the SDP standard form as follows: $$ \begin{align*} \text{minimize} \quad & \langle C, X \rangle \\ \text{subject to} \quad & \sum_{i \leq j} x_{ij} B_{ij} \succeq \Gamma \\ & X \succeq 0, \ X \in \mathbb{S}^m. \end{align*} $$
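
A minimal NumPy sketch of this construction (the helper name `lmi_from_standard_form` and the data are ours, for illustration). The sanity check confirms that when the equalities $\langle A_k, X \rangle = b_k$ hold, the diagonal LMI holds with equality:

```python
import numpy as np

def lmi_from_standard_form(A_list, b):
    """Build the diagonal matrices B_ij and Gamma of the construction above."""
    t = len(A_list)
    m = A_list[0].shape[0]
    B = {}
    for i in range(m):
        for j in range(i, m):
            alpha = 2.0 if i < j else 1.0           # off-diagonal entries of X appear twice
            d = np.zeros(2 * t)
            for k in range(t):
                d[k] = alpha * A_list[k][i, j]      # entry k encodes <A_k, X> >= b_k
                d[t + k] = -alpha * A_list[k][i, j] # entry t+k encodes <A_k, X> <= b_k
            B[(i, j)] = np.diag(d)
    Gamma = np.diag(np.concatenate([b, -b]))
    return B, Gamma

# Sanity check: if <A_k, X> = b_k for all k, then
# sum_{i<=j} x_ij * B_ij - Gamma is the zero matrix.
rng = np.random.default_rng(0)
A_list = [(M + M.T) / 2 for M in rng.standard_normal((2, 3, 3))]
X = np.eye(3)
b = np.array([np.trace(Ak @ X) for Ak in A_list])
B, Gamma = lmi_from_standard_form(A_list, b)
lhs = sum(X[i, j] * Bij for (i, j), Bij in B.items()) - Gamma
assert np.allclose(lhs, 0)
```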

LMIs in SDP Standard Form

Now, suppose that we have an SDP with constraints given as an LMI: $$ \begin{align*} \text{minimize} \quad & \sum_{i=1}^n c_i x_i \\ \text{subject to} \quad & \sum_{i=1}^n x_i B_i \succeq \Gamma \\ & x \in \mathbb{R}^n, \end{align*} $$ where $B_i, \Gamma \in \mathbb{S}^m$ are symmetric matrices.

Before we convert it to the standard form, we can write $x_i = u_i - v_i$, where $u_i, v_i \geq 0$. Then, we can rewrite the above program as $$ \begin{align*} \text{minimize} \quad & \sum_{i=1}^n c_i (u_i - v_i) \\ \text{subject to} \quad & \sum_{i=1}^n (u_i - v_i) B_i \succeq \Gamma \\ & u_i, v_i \geq 0, \quad i = 1, \ldots, n. \end{align*} $$

Now, we can write the above SDP in standard form. Define the slack matrix $S := \sum_{i=1}^n (u_i - v_i) B_i - \Gamma \succeq 0$, and collect everything into one block-diagonal variable $Y \in \mathbb{S}^{2n + m}$ whose first $n$ diagonal entries are $u_1, \ldots, u_n$, whose next $n$ diagonal entries are $v_1, \ldots, v_n$, and whose bottom-right $m \times m$ block is $S$. The constraints become $$ \begin{align*} & Y \succeq 0, \\ & \langle E_{ij} + E_{ji}, Y \rangle = 0, \quad i \neq j \in [2n], \\ & \langle A_{ij}, Y \rangle = \Gamma_{ij}, \quad i \leq j \in [m], \end{align*} $$ where $E_{ij}$ has a one in entry $(i, j)$ and zeros elsewhere (so the second line forces the top-left $2n \times 2n$ block of $Y$ to be diagonal), and $A_{ij} \in \mathbb{S}^{2n + m}$ is chosen so that $\langle A_{ij}, Y \rangle = \sum_{k=1}^n (u_k - v_k) (B_k)_{ij} - S_{ij}$, encoding the definition of the slack matrix. Finally, the objective becomes $\langle \widetilde{C}, Y \rangle$, where $\widetilde{C} := \mathrm{Diag}(c, -c, 0) \in \mathbb{S}^{2n + m}$.
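
A minimal NumPy sketch of this embedding (the function name `embed` and all data are illustrative); it packs $(u, v, S)$ into the block-diagonal $Y$ and checks that the objective value is preserved:

```python
import numpy as np

def embed(u, v, B_list, Gamma, c):
    """Pack (u, v, slack matrix S) into the block-diagonal Y in S^(2n+m)."""
    n, m = len(u), Gamma.shape[0]
    S = sum((u[i] - v[i]) * B_list[i] for i in range(n)) - Gamma  # slack matrix
    Y = np.zeros((2 * n + m, 2 * n + m))
    Y[:n, :n] = np.diag(u)
    Y[n:2 * n, n:2 * n] = np.diag(v)
    Y[2 * n:, 2 * n:] = S
    C_tilde = np.diag(np.concatenate([c, -c, np.zeros(m)]))
    return Y, C_tilde

rng = np.random.default_rng(1)
n, m = 2, 3
B_list = [(M + M.T) / 2 for M in rng.standard_normal((n, m, m))]
Gamma = -10 * np.eye(m)          # a loose LMI, so the slack block is PSD here
c = rng.standard_normal(n)
u, v = np.array([0.3, 0.0]), np.array([0.0, 0.7])
Y, C_tilde = embed(u, v, B_list, Gamma, c)
assert np.isclose(np.trace(C_tilde @ Y), c @ (u - v))  # objective preserved
```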


Duality Theorems

We will now discuss the duality theory for semidefinite programs.

Weak Duality

Consider the primal SDP $$ \begin{align*} \text{minimize} \quad & \langle C, X \rangle \\ \text{subject to} \quad & \langle A_i, X \rangle = b_i, \quad i = 1, \ldots, t \\ & X \succeq 0, \ X \in \mathbb{S}^m. \end{align*} $$

If we look at what happens when we multiply the $i^{\text{th}}$ constraint by $y_i$ and sum over all constraints, we get $$ \sum_{i=1}^t y_i \langle A_i, X \rangle = \sum_{i=1}^t y_i b_i \quad \Longrightarrow \quad \left\langle \sum_{i=1}^t y_i A_i, \, X \right\rangle = y^T b. $$

Thus, if $\sum_{i=1}^t y_i A_i \preceq C$, then, since $C - \sum_{i=1}^t y_i A_i \succeq 0$ and $X \succeq 0$ imply $\left\langle C - \sum_{i=1}^t y_i A_i, \, X \right\rangle \geq 0$, we have $$ y^T b = \left\langle \sum_{i=1}^t y_i A_i, \, X \right\rangle \leq \langle C, X \rangle. $$ This tells us that $y^T b$ is a lower bound on the optimal value of the primal SDP.

Thus, if we define the dual SDP as $$ \begin{align*} \text{maximize} \quad & y^T b \\ \text{subject to} \quad & \sum_{i=1}^t y_i A_i \preceq C \\ & y \in \mathbb{R}^t, \end{align*} $$ then we have the following weak duality theorem:


Weak Duality Theorem: For any primal feasible solution $X$ and dual feasible solution $y$, we have $$ y^T b \leq \langle C, X \rangle. $$
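
Here is a minimal numerical illustration on a toy instance (one constraint, with $A_1 = I$ and $b_1 = 1$, so the primal optimum is $\lambda_{\min}(C)$); any feasible pair $(X, y)$ witnesses the inequality:

```python
import numpy as np

# Toy instance: minimize <C, X> s.t. tr(X) = 1, X PSD, with A_1 = I, b_1 = 1.
# The dual maximizes y subject to y*I <= C (in the PSD order).
C = np.array([[2.0, 1.0], [1.0, 2.0]])       # eigenvalues 1 and 3

X = np.eye(2) / 2                            # primal feasible: tr(X) = 1, X PSD
y = 0.5                                      # dual feasible: C - y*I is PSD

assert np.all(np.linalg.eigvalsh(C - y * np.eye(2)) >= 0)
print(y * 1.0, "<=", np.trace(C @ X))        # weak duality: 0.5 <= 2.0
```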


Strong Duality

In a similar way to linear programs, strong duality holds for semidefinite programs, albeit under an additional assumption, known as Slater’s condition.

In the homework, you will see that strong duality may not hold for all feasible primal and dual SDPs.

However, if we assume that both the primal and dual SDPs are strictly feasible (this is Slater’s condition), then strong duality holds.

  • The primal SDP is strictly feasible if there exists an $X \succ 0$ such that $\langle A_i, X \rangle = b_i$ for all $i$.
  • The dual SDP is strictly feasible if there exists a $y$ such that $\sum_{i=1}^t y_i A_i \prec C$.

Strong Duality Theorem: If the primal and dual SDPs are strictly feasible (i.e., if Slater’s condition holds), then the optimal values of the primal and dual SDPs are equal.
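
Continuing the toy instance above in a minimal cvxpy sketch (assuming cvxpy is installed): $X = I/2$ and $y = 0$ are strictly feasible, so Slater’s condition holds, and the two optimal values coincide at $\lambda_{\min}(C) = 1$:

```python
import cvxpy as cp
import numpy as np

C = np.array([[2.0, 1.0], [1.0, 2.0]])

# Primal: minimize <C, X> s.t. tr(X) = 1, X PSD.
X = cp.Variable((2, 2), symmetric=True)
primal = cp.Problem(cp.Minimize(cp.trace(C @ X)),
                    [X >> 0, cp.trace(X) == 1])
primal.solve()

# Dual: maximize y s.t. y*I <= C (in the PSD order).
y = cp.Variable()
dual = cp.Problem(cp.Maximize(y), [C - y * np.eye(2) >> 0])
dual.solve()

print(primal.value, dual.value)   # both are approximately 1 = lambda_min(C)
```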


Complementary Slackness

Just as with linear programs, we have complementary slackness for SDPs.


Complementary Slackness Theorem: Suppose $X$ and $y$ are primal and dual feasible solutions, respectively, and that strong duality holds. Then $X$ and $y$ are both optimal if and only if the complementary slackness condition holds: $$ \left( C - \sum_{i=1}^t y_i A_i \right) X = 0. $$
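
A minimal numerical check on the same toy instance (assuming cvxpy; the sign convention of cvxpy’s `dual_value` for equality constraints is an assumption we do not rely on, so the sketch tries both signs):

```python
import cvxpy as cp
import numpy as np

C = np.array([[2.0, 1.0], [1.0, 2.0]])

X = cp.Variable((2, 2), symmetric=True)
trace_con = cp.trace(X) == 1                 # A_1 = I, b_1 = 1
prob = cp.Problem(cp.Minimize(cp.trace(C @ X)), [X >> 0, trace_con])
prob.solve()

y = trace_con.dual_value                     # multiplier of the equality constraint
residual = min(np.linalg.norm((C - s * y * np.eye(2)) @ X.value)
               for s in (+1.0, -1.0))        # try both sign conventions
print("||(C - y A_1) X|| ~", residual)       # approximately 0
```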


SDP Relaxations

Just as we used LP relaxations to obtain approximate solutions to NP-hard problems, by formulating such problems as integer linear programs, we can use SDP relaxations to obtain approximate solutions to NP-hard problems.

Since we can formulate any NP-complete problem as an integer linear program, any combinatorial optimization problem coming from an NP-complete problem can be cast as an ILP. Hence, a question arises: why use SDP relaxations instead of LP relaxations? Do we gain anything by doing so?

Today, and in the next lecture, we will see that SDP relaxations can be more powerful than LP relaxations! Moreover, this has been a very fruitful area of research in the last 30 years, with many beautiful results (for those looking for a final project).

Quadratic Programming

A quadratic program (QP) is an optimization problem of the form $$ \begin{align*} \text{minimize} \quad & \frac{1}{2} x^T Q x + c^T x \\ \text{subject to} \quad & q_i(x) \geq 0, \quad i = 1, \ldots, t \\ & x \in \mathbb{R}^n, \end{align*} $$ where $Q \in \mathbb{S}^n$ is a symmetric matrix, and $q_i(x)$ are quadratic functions of $x$.
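
A minimal convex instance in cvxpy (QPs with indefinite $Q$ or general quadratic $q_i$ are NP-hard, so this sketch sticks to a PSD $Q$ and a ball constraint; all data is illustrative):

```python
import cvxpy as cp
import numpy as np

Q = np.array([[2.0, 0.5], [0.5, 1.0]])       # PSD, so the objective is convex
c = np.array([-1.0, -1.0])

x = cp.Variable(2)
objective = 0.5 * cp.quad_form(x, Q) + c @ x
constraints = [cp.sum_squares(x) <= 1]       # q(x) = 1 - x^T x >= 0 (unit ball)
prob = cp.Problem(cp.Minimize(objective), constraints)
prob.solve()
print(prob.value, x.value)
```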

An advantage of studying QPs is that they are a very expressive class of optimization problems (generalizing $\{0,1\}$-ILPs). However, a disadvantage is that they are NP-hard to solve in general.
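
To see why QPs generalize $\{0,1\}$-ILPs, note that the integrality constraint $x_i \in \{0, 1\}$ is captured by the pair of quadratic constraints $$ x_i^2 - x_i \geq 0 \quad \text{and} \quad x_i - x_i^2 \geq 0, $$ which together force $x_i^2 = x_i$, i.e., $x_i \in \{0, 1\}$.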

Nevertheless, we can relax QPs to SDPs, and thus we can use the same template as we used for LP relaxations to obtain approximate solutions to QPs! And as we will see in this and the next lecture, SDP relaxations can be more powerful than LP relaxations.

Main Example: Max-Cut

The Max-Cut problem is defined as follows:


Max-Cut Problem: Given a graph $G = (V, E)$, find a cut $(S, V \setminus S)$ that maximizes the number of edges between $S$ and $V \setminus S$, i.e., the number of edges across the cut, which is denoted by $|E(S, V \setminus S)|$.


While the minimum cut problem can be solved in polynomial time, the Max-Cut problem is NP-hard.
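
As a preview of where this is headed, here is a minimal cvxpy sketch of the standard SDP relaxation of Max-Cut (the Goemans–Williamson relaxation, which we will derive in the next lecture); the example graph is illustrative. The relaxation replaces labels $x_i \in \{\pm 1\}$ with a PSD matrix $X$ with unit diagonal and maximizes $\sum_{\{i,j\} \in E} (1 - X_{ij})/2$:

```python
import cvxpy as cp

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # a 4-cycle plus a chord
n = 4

X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0, cp.diag(X) == 1]            # X_ii = 1 relaxes x_i in {-1, +1}
objective = cp.Maximize(sum((1 - X[i, j]) / 2 for i, j in edges))
prob = cp.Problem(objective, constraints)
prob.solve()
print("SDP upper bound on Max-Cut:", prob.value)   # at least the true max cut (= 4 here)
```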
