Linear Programming & Duality Theorems

Mathematical Programming deals with problems of the form

$$\min f(x) \quad \text{subject to} \quad g_1(x) \le 0,\; g_2(x) \le 0,\; \ldots,\; g_m(x) \le 0, \qquad x \in \mathbb{R}^n$$

If we do not impose any constraints on the functions $f$ and $g_i$, the above is a very general family of problems, which includes many NP-hard problems, such as quadratic programming, integer programming, etc.

In this lecture we will focus on a particular case of mathematical programming, called Linear Programming, where the functions f and gi are affine linear functions. In particular, in this lecture we will learn about Linear Programming and its strong duality theorem.

Traces of the idea of linear programming can be found in the works of Fourier, and linear programming was first formally studied in the works of Kantorovich, Koopmans, Dantzig, and von Neumann in the 1940s and 1950s.

Linear Programming

An affine linear function $f : \mathbb{R}^n \to \mathbb{R}$ is a function of the form $f(x) = c^T x + b = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n + b$, where $c \in \mathbb{R}^n$ and $b \in \mathbb{R}$.

A linear program is a mathematical programming problem of the form $\min c^T x$ subject to $Ax \le b$, $x \in \mathbb{R}^n$, where $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$.

Given any linear program, we define the feasible region as the set of points that satisfy the constraints of the linear program, i.e., for the above linear program we define $$\text{Feasible region} := \{x \in \mathbb{R}^n : Ax \le b\}.$$
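Membership in the feasible region is straightforward to check numerically. Below is a minimal sketch in plain Python (the helper name and the instance are illustrative, not from the lecture) that tests whether a point satisfies $Ax \le b$:

```python
def is_feasible(A, b, x, tol=1e-9):
    """Return True if x satisfies every inequality of Ax <= b.

    A is a list of m rows (each a length-n list), b a length-m list,
    x a candidate point as a length-n list.
    """
    for row, bi in zip(A, b):
        if sum(a * xi for a, xi in zip(row, x)) > bi + tol:
            return False
    return True

# Illustrative instance: the triangle x1 >= 0, x2 >= 0, x1 + x2 <= 1,
# written as Ax <= b with the rows below.
A = [[-1, 0], [0, -1], [1, 1]]
b = [0, 0, 1]
print(is_feasible(A, b, [0.2, 0.3]))  # True: inside the triangle
print(is_feasible(A, b, [0.8, 0.8]))  # False: violates x1 + x2 <= 1
```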

One important property of linear programs is that the feasible region is always a convex set. A convex set $K$ is a set such that for any two points $x, y \in K$, the line segment connecting $x$ and $y$ is also in $K$. In other words, for any $x, y \in K$ and any $\lambda \in [0,1]$, the convex combination $\lambda x + (1 - \lambda) y$ is in $K$.

Since the feasible region is defined by a finite set of linear inequalities, we also know that it is a convex polyhedron (a convex polytope when it is bounded).

As convex combinations will be important for us, let us define them more formally. Given a set of points $x_1, x_2, \ldots, x_k \in \mathbb{R}^n$, a convex combination of these points is a point of the form $\lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_k x_k$, where $\lambda_1, \lambda_2, \ldots, \lambda_k \ge 0$ and $\lambda_1 + \lambda_2 + \cdots + \lambda_k = 1$.
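To make the definition concrete, here is a small Python sketch (the helper is illustrative, not part of the lecture) that forms a convex combination of points, checking that the weights are non-negative and sum to one:

```python
def convex_combination(points, lambdas):
    """Return sum_i lambdas[i] * points[i] for weights that are
    non-negative and sum to 1."""
    assert all(l >= 0 for l in lambdas)
    assert abs(sum(lambdas) - 1.0) < 1e-9
    n = len(points[0])
    return [sum(l * p[j] for l, p in zip(lambdas, points)) for j in range(n)]

# Illustrative: a point inside the triangle with vertices (0,0), (1,0), (0,1).
print(convex_combination([[0, 0], [1, 0], [0, 1]], [0.5, 0.25, 0.25]))  # [0.25, 0.25]
```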

Standard Form

We can always represent a linear program in the following standard form: $\min c^T x$ subject to $Ax = b$, $x \ge 0$, $x \in \mathbb{R}^n$, where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, $c \in \mathbb{R}^n$, and we say that $x \ge 0$ if $x_i \ge 0$ for all $i = 1, 2, \ldots, n$.
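The reduction to standard form is mechanical: replace each free variable by a difference $x = x^+ - x^-$ of non-negative variables and add one slack variable per inequality. A minimal Python sketch of this bookkeeping (helper name and data are illustrative):

```python
def to_standard_form(A, b, c):
    """Rewrite  min c^T x : Ax <= b, x free  as  min c'^T z : A'z = b, z >= 0,
    via x = x_plus - x_minus and one slack variable per inequality, so that
    z = (x_plus, x_minus, s), A' = [A, -A, I], c' = (c, -c, 0)."""
    m = len(A)
    A_std = [A[i] + [-a for a in A[i]] + [1.0 if j == i else 0.0 for j in range(m)]
             for i in range(m)]
    c_std = c + [-ci for ci in c] + [0.0] * m
    return A_std, b, c_std

A_std, b_std, c_std = to_standard_form([[1, 1]], [1], [2, 3])
print(A_std)  # [[1, 1, -1, -1, 1.0]]
print(c_std)  # [2, 3, -2, -3, 0.0]
```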

Important Questions

Given a linear program, which we assume is in standard form, we are interested in answering the following questions:

  1. When is a linear program feasible (i.e., is there a solution to the constraints)?

  2. When is a linear program bounded (i.e., is there a minimum value to the objective we are trying to minimize)?

  3. Can we characterize the optimal solutions to a linear program?

    3.1 How do we know if a solution is optimal?

    3.2. Do the optimal solutions have a nice description?

    3.3. Do the optimal solutions have small bit complexity?

  4. Can we efficiently solve a linear program?

Structure of Linear Programs

To address the questions above, we will first study the structure of linear programs.

A first observation is that the feasible region of a linear program is a convex polyhedron, as it is the intersection of a finite number of half-spaces.

We are now ready to state the fundamental theorem of linear inequalities, proved by Farkas (1894, 1898) and Minkowski (1896).


Theorem 1 (Fundamental Theorem of Linear Inequalities): Let $a_1, a_2, \ldots, a_m, b \in \mathbb{R}^n$, and let $r := \operatorname{rank}\{a_1, a_2, \ldots, a_m, b\}$. Then, exactly one of the following holds:

  1. $b$ is a non-negative linear combination of $a_1, a_2, \ldots, a_m$.

  2. There exists a hyperplane $H := \{x \in \mathbb{R}^n : c^T x = 0\}$ such that

    2.1. $b$ is in the half-space $H^+ := \{x \in \mathbb{R}^n : c^T x > 0\}$.

    2.2. $a_1, a_2, \ldots, a_m$ are in the half-space $H^- := \{x \in \mathbb{R}^n : c^T x \le 0\}$.

    2.3. $H$ contains $r - 1$ linearly independent vectors from $a_1, a_2, \ldots, a_m$.


Translating to the affine setting, if one takes vectors $\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_m$ and a vector $\hat{b}$, given by $\hat{a}_i = \begin{pmatrix} 1 \\ a_i \end{pmatrix}$ and $\hat{b} = \begin{pmatrix} 1 \\ b \end{pmatrix}$, the theorem states that exactly one of the following holds:

  1. $b$ is a convex combination of $a_1, a_2, \ldots, a_m$.

  2. There exists a hyperplane $H := \{x \in \mathbb{R}^{n+1} : c^T x = 0\}$ such that

    2.1. $\hat{b}$ is in the half-space $H^+ := \{x \in \mathbb{R}^{n+1} : c^T x > 0\}$.

    2.2. $\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_m$ are in the half-space $H^- := \{x \in \mathbb{R}^{n+1} : c^T x \le 0\}$.

    2.3. $H$ contains $r - 1$ linearly independent vectors from $\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_m$.

One can see that the above follows from the fundamental theorem, as any non-negative linear combination of $\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_m$ giving $\hat{b}$ must be a convex combination (due to the first coordinate being $1$ for all these vectors).

Remark 1: Any hyperplane $H$ which separates $b$ from $a_1, a_2, \ldots, a_m$ is called a separating hyperplane.

Farkas’ Lemma


Lemma 1 (Farkas’ Lemma): Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. The following two statements are equivalent:

  1. There exists $x \in \mathbb{R}^n$ such that $Ax = b$ and $x \ge 0$.
  2. For all $y \in \mathbb{R}^m$, if $y^T A \ge 0$, then $y^T b \ge 0$.

There are two equivalent formulations of Farkas’ Lemma, which will be useful for us.


Lemma 2 (Farkas’ Lemma - variant 1): Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. Then exactly one of the following holds:

  1. There exists $x \in \mathbb{R}^n$ such that $Ax = b$ and $x \ge 0$.
  2. There exists $y \in \mathbb{R}^m$ such that $y^T b > 0$ and $y^T A \le 0$.
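One practical reading of variant 1 is that a vector $y$ as in case 2 is a short, easily checkable certificate of infeasibility. A minimal Python sketch of such a check (the helper and the instance are illustrative):

```python
def is_farkas_certificate(A, b, y, tol=1e-9):
    """Check that y proves {x : Ax = b, x >= 0} empty, in the sense of
    variant 1: y^T b > 0 and y^T A <= 0."""
    m, n = len(A), len(A[0])
    yTb = sum(yi * bi for yi, bi in zip(y, b))
    yTA = [sum(y[i] * A[i][j] for i in range(m)) for j in range(n)]
    return yTb > tol and all(v <= tol for v in yTA)

# Illustrative: x1 + x2 = -1 with x >= 0 is infeasible, and y = [-1]
# certifies it: y^T b = 1 > 0 while y^T A = [-1, -1] <= 0.
print(is_farkas_certificate([[1, 1]], [-1], [-1]))  # True
```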


Lemma 3 (Farkas’ Lemma - variant 2): Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. The following two statements are equivalent:

  1. There exists $x \in \mathbb{R}^n$ such that $Ax \le b$.
  2. For all $y \in \mathbb{R}^m$ such that $y^T A = 0$ and $y \ge 0$, we have $y^T b \ge 0$.

Proof of Lemma 3: Let $M = \begin{pmatrix} I & A & -A \end{pmatrix}$. Then $Ax \le b$ has a solution if and only if $Mz = b$ has a solution with $z \ge 0$ (write $z = (s, x^+, x^-)$, where $x = x^+ - x^-$ and $s = b - Ax$ are the slack variables). By Farkas’ Lemma (Lemma 1), this is equivalent to the statement that for all $y \in \mathbb{R}^m$ such that $y^T M \ge 0$, we have $y^T b \ge 0$. Since $y^T M = \begin{pmatrix} y^T & y^T A & -y^T A \end{pmatrix}$, the condition $y^T M \ge 0$ is equivalent to $y \ge 0$ and $y^T A = 0$, which gives the claim.

Duality Theory

Given a linear program in standard form $\min c^T x$ subject to $Ax = b$, $x \ge 0$, $x \in \mathbb{R}^n$, we know from Farkas’ Lemma that the feasible region is non-empty if and only if for all $y \in \mathbb{R}^m$ such that $y^T A \ge 0$, we have $y^T b \ge 0$.

If we look at what happens when we multiply the constraints by $y^T$, we note the following: if $y^T A \le c^T$, then $y^T A x \le c^T x$ (as $x \ge 0$), and since $Ax = b$ we get $y^T b \le c^T x$.

Thus, if we can find a $y$ such that $y^T A \le c^T$, then $y^T b \le c^T x$ for all feasible $x$. Thus, $y^T b$ is a lower bound on the optimal value of the linear program.

This motivates the following definition.


Definition 1 (Dual Linear Program): The dual linear program of a linear program in standard form $\min c^T x$ subject to $Ax = b$, $x \ge 0$, $x \in \mathbb{R}^n$ is the linear program $\max y^T b$ subject to $y^T A \le c^T$, $y \in \mathbb{R}^m$.
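Forming the dual from standard-form data is purely mechanical: transpose $A$ and swap the roles of $b$ and $c$. A minimal Python sketch (the helper is illustrative):

```python
def dual_of_standard_form(A, b, c):
    """Given data (A, b, c) of the primal  min c^T x : Ax = b, x >= 0,
    return data (A_T, c, b) of the dual  max b^T y : A^T y <= c."""
    m, n = len(A), len(A[0])
    A_T = [[A[i][j] for i in range(m)] for j in range(n)]
    return A_T, c, b  # constraint matrix, right-hand side, objective

A_T, rhs, obj = dual_of_standard_form([[1, 2], [3, 4]], [5, 6], [7, 8])
print(A_T)  # [[1, 3], [2, 4]]
```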


Practice Problem: prove that the dual of the dual linear program is the primal linear program.

By the above discussion we have proved that the optimal value of the dual linear program is a lower bound on the optimal value of the primal linear program. This is the content of the following theorem, known as the Weak Duality Theorem.


Theorem 2 (Weak Duality Theorem): Let $x$ be a feasible solution to the primal linear program and let $y$ be a feasible solution to the dual linear program. Then $c^T x \ge y^T b$.


Let $\alpha$ be the optimal value of the primal linear program and let $\beta$ be the optimal value of the dual linear program. The Weak Duality Theorem states that $\alpha \ge \beta$. Moreover, if the primal problem is unbounded, i.e. $\alpha = -\infty$, then the dual problem is infeasible, i.e. $\beta = -\infty$. Similarly, if the dual problem is unbounded, i.e. $\beta = +\infty$, then the primal problem is infeasible, i.e. $\alpha = +\infty$.
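Weak duality is easy to verify numerically on any primal-feasible/dual-feasible pair. A minimal Python sketch (the instance is illustrative):

```python
def weak_duality_gap(b, c, x, y):
    """Return c^T x - y^T b for a primal-feasible x and dual-feasible y;
    by weak duality this gap is always >= 0."""
    cTx = sum(ci * xi for ci, xi in zip(c, x))
    yTb = sum(yi * bi for yi, bi in zip(y, b))
    return cTx - yTb

# Illustrative LP: min x1 + x2 s.t. x1 + x2 = 1, x >= 0; its dual is
# max y s.t. y <= 1. Take feasible x = (0.5, 0.5) and y = 0.5.
gap = weak_duality_gap([1], [1, 1], [0.5, 0.5], [0.5])
print(gap)  # 0.5
```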

Now it is natural to ask whether the inequality $\alpha \ge \beta$ can be strict, or if it is actually an equality. The answer to this question is given by the Strong Duality Theorem, which states that under feasibility conditions, the optimal values of the primal and dual linear programs are always equal!


Theorem 3 (Strong Duality Theorem): If the primal and dual linear programs have feasible solutions, then the optimal values of the primal and dual linear programs are equal, i.e., $\alpha = \beta$.


Proof of Strong Duality Theorem: Since we have proved weak duality, to prove that $\alpha = \beta$ it suffices to show that the following LP has a feasible solution: $\max 0$ subject to $c^T x - y^T b \le 0$, $Ax = b$, $x \ge 0$, $y^T A \le c^T$, $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$. Indeed, any feasible pair $(x, y)$ satisfies $c^T x \le y^T b$, which together with weak duality forces $c^T x = y^T b$. To show that this program has a feasible solution, given that the primal and dual linear programs have feasible solutions, we can use variant 2 of Farkas’ Lemma (Lemma 3).

The above LP can be encoded in matrix form as $\max 0$ subject to
$$\begin{pmatrix} c^T & -b^T \\ A & 0 \\ -A & 0 \\ 0 & A^T \\ -I & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \le \begin{pmatrix} 0 \\ b \\ -b \\ c \\ 0 \end{pmatrix}.$$
Call the $(1 + 2(n+m)) \times (n+m)$ constraint matrix above $B$.

Variant 2 of Farkas’ Lemma (Lemma 3) states that this LP has a feasible solution if and only if for all $z \in \mathbb{R}^{1+2(n+m)}$ such that $z^T B = 0$ and $z \ge 0$, we have $z^T (0, b, -b, c, 0)^T \ge 0$.

Let $z^T = (\lambda, u^T, v^T, w^T, e^T)$, where $\lambda \in \mathbb{R}$, $u, v \in \mathbb{R}^m$, and $w, e \in \mathbb{R}^n$. Then the above condition becomes: whenever $z^T B = 0$ and $z \ge 0$, we must have $u^T b - v^T b + w^T c \ge 0$. We have two cases to consider:

  1. $\lambda > 0$: In this case, from $z^T B = 0$ we get the following equations: $\lambda c^T + u^T A - v^T A - e^T = 0$ and $-\lambda b + A w = 0$. From the first equation we get that $u^T A - v^T A = -\lambda c^T + e^T \ge -\lambda c^T$, as $e^T \ge 0$. Since $w \ge 0$, we have that $(u^T - v^T) A w \ge -\lambda c^T w$, which when combined with the second equation gives $\lambda (u^T - v^T) b \ge -\lambda c^T w$, and hence $u^T b - v^T b + w^T c \ge 0$, where the last step uses $\lambda > 0$.

  2. $\lambda = 0$: In this case, from $z^T B = 0$ we get the following equations: $u^T A - v^T A - e^T = 0$ and $A w = 0$.

Let $x, y$ be feasible solutions to the primal and dual linear programs, respectively (which exist by assumption). Then $x \ge 0$, $Ax = b$, and $y^T A \le c^T$.

Thus, from $w \ge 0$ and from the second equality above we have that $c^T w \ge y^T A w = 0$. Moreover, from the first equality above and from $Ax = b$, $x \ge 0$, we have that $(u^T - v^T) A = e^T \ge 0$, so $(u^T - v^T) A x \ge 0$, which gives $u^T b - v^T b \ge 0$. Thus, we have that $u^T b - v^T b + w^T c \ge 0$.

In both cases we have that $u^T b - v^T b + w^T c \ge 0$, which completes the proof of the Strong Duality Theorem.

Farkas’ Lemma - Affine Form

A consequence of Strong Duality is the following affine form of Farkas’ Lemma.


Lemma 4 (Farkas’ Lemma - Affine Form): Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. Let the system $Ax \le b$ be feasible, and suppose that the inequality $c^T x \le \delta$ holds whenever $x$ satisfies $Ax \le b$. Then there exists $\delta' \le \delta$ such that the linear inequality $c^T x \le \delta'$ is a non-negative linear combination of the inequalities in the system $Ax \le b$.
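A claimed combination as in Lemma 4 is again easy to verify: multipliers $\lambda \ge 0$ must satisfy $\lambda^T A = c^T$ and $\lambda^T b = \delta' \le \delta$. A minimal Python sketch (the helper and the instance are illustrative):

```python
def is_affine_farkas_certificate(A, b, c, delta, lam, tol=1e-9):
    """Verify that multipliers lam >= 0 combine the rows of Ax <= b into
    c^T x <= delta_prime with delta_prime = lam^T b <= delta."""
    m, n = len(A), len(A[0])
    if any(l < -tol for l in lam):
        return False
    combo = [sum(lam[i] * A[i][j] for i in range(m)) for j in range(n)]
    delta_prime = sum(l * bi for l, bi in zip(lam, b))
    return all(abs(ci - gi) <= tol for ci, gi in zip(c, combo)) and delta_prime <= delta + tol

# Illustrative: x1 <= 1 and x2 <= 2 together imply x1 + x2 <= 3, with
# multipliers lam = (1, 1).
print(is_affine_farkas_certificate([[1, 0], [0, 1]], [1, 2], [1, 1], 3, [1.0, 1.0]))  # True
```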


Practice Problem: use LP duality and Farkas’ Lemma to prove the above lemma.

Complementary Slackness

If both primal and dual linear programs have feasible solutions, and if $x$ is a feasible solution to the primal linear program and $y$ is a feasible solution to the dual linear program, then the following conditions are equivalent:

  1. $x$ is optimal for the primal linear program and $y$ is optimal for the dual linear program.
  2. $c^T x = y^T b$.
  3. For all $i \in [n]$, if $x_i > 0$, then the $i$-th inequality in $y^T A \le c^T$ is tight at $y$. That is, $y^T A_i = c_i$, where $A_i$ is the $i$-th column of $A$.

Note that 1 and 2 are equivalent by the Strong Duality Theorem, and 2 and 3 are equivalent by the following equation: $$c^T x - y^T b = c^T x - y^T A x = (c^T - y^T A) x = \sum_{i=1}^n (c_i - y^T A_i) x_i.$$
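Condition 3 can be checked coordinate by coordinate. A minimal Python sketch (the helper and instance are illustrative; $A_i$ is the $i$-th column of $A$):

```python
def complementary_slackness_holds(A, c, x, y, tol=1e-9):
    """Check condition 3: whenever x_i > 0, the i-th dual constraint is
    tight, i.e. y^T A_i = c_i for the i-th column A_i of A."""
    m, n = len(A), len(A[0])
    for i in range(n):
        yTAi = sum(y[k] * A[k][i] for k in range(m))
        if x[i] > tol and abs(yTAi - c[i]) > tol:
            return False
    return True

# Illustrative LP: min x1 + x2 s.t. x1 + x2 = 1, x >= 0, with optimal
# primal x = (1, 0) and optimal dual y = 1.
print(complementary_slackness_holds([[1, 1]], [1, 1], [1.0, 0.0], [1.0]))  # True
```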

Conclusion

In this lecture, we have learned about mathematical programming, its generality, and we have studied the structure of a particular case of mathematical programming, called Linear Programming.

As we will see in the next lecture, Linear Programming is a very powerful tool not only for optimization: its duality theory has many applications in computer science, economics, and other areas.

References

This lecture was prepared based on the following references:

  1. Schrijver, A. (1986). Theory of Linear and Integer Programming.