Lecture 1: Computational Models & Complexity Classes
Today we will start our journey into the world of Algebraic Complexity Theory. We will start by discussing the basic computational models and complexity classes that we will be using throughout the course, as well as some models that we will not be using, but are important to know about.
In classical complexity theory, there are two main models of computation: the uniform model (Turing machines) and the non-uniform model (Boolean circuits).
- The uniform model encapsulates the idea that a program of fixed size can solve problems of arbitrary size (that is, the size of the program does not depend on the size of the input). This is what we are used to when we think of designing/programming an algorithm.
- The non-uniform model encapsulates the idea of “infinite-sized programs”: for every input size $n$, we have a different program $C_n$ that solves problems of size $n$. In this case the description of our “program” is given by the family of programs $\{C_n\}_{n \in \mathbb{N}}$. Even though this model might seem strange at first, note that it captures the notion of circuits (such as GPUs or VLSI circuits).
Note that each program (in either the uniform or the non-uniform model) can be thought of as a function that maps inputs to outputs (we are only thinking of deterministic algorithms for now). So we can associate to each deterministic program the function it computes, which is a Boolean function in the case of decision problems.
Both models above have their analogues in the algebraic world. However, in the uniform model, the notion of algebraic computation also deals with analytic problems and is much more general than what we will be discussing in this course. For those interested, please see the book [BCSS] in the resources page.
In the non-uniform model, we have the notion of algebraic circuits, which are the algebraic analogue of Boolean circuits. Whereas Boolean circuits compute Boolean functions, algebraic circuits compute algebraic functions (polynomials or rational functions) over some field $\mathbb{F}$. Since rational functions (in the commutative setting) are simply ratios of two polynomials, we can focus on the computation of polynomials.
Hence, by an “algebraic function” we will mean a family of polynomials $\{f_n\}_{n \in \mathbb{N}}$ where $f_n \in \mathbb{F}[x_1, \ldots, x_n]$. In this case each $f_n$ can be thought of as a polynomial function from $\overline{\mathbb{F}}^n$ to $\overline{\mathbb{F}}$, where $\overline{\mathbb{F}}$ denotes the algebraic closure of $\mathbb{F}$.
In this course, we will be focusing on the non-uniform model, and in particular, on algebraic circuits (or straight-line programs), which we now formally define.
Algebraic Circuits, Formulas, and Branching Programs
Definition 1 (Algebraic Circuit/Straight-Line Program): An algebraic circuit $\Phi$ is a directed acyclic graph (DAG) with the following properties:
- Each node is either an input node, a constant node, or a gate node.
- Each gate node is labeled by an algebraic operation: $+$, $-$, $\times$ and $\div$.
- Each gate node has in-degree 2.
- Each input or constant node has in-degree 0.
Each gate of a circuit computes a rational function in the natural way. For instance, if $g$ is a gate labeled by $+$, then the rational function computed by $g$ is the sum of the rational functions computed by the two children of $g$. The other operations are handled analogously (with the caveat that we should never divide by the zero rational function).
We say that a circuit $\Phi$ computes a rational function (or a polynomial) $f$ if one of the gates of $\Phi$ computes $f$.
Remark 1: If the underlying undirected graph of the circuit is a tree, that is, the out-degree of each node is at most 1, then the circuit is called a formula.
Example 1: The following is an example of two distinct algebraic circuits computing $f(x, y) = x^2 - y^2$
$g_1 = x, g_2 = y, g_3 = g_1 \times g_1, g_4 = g_2 \times g_2, g_5 = g_3 - g_4$
$h_1 = x, h_2 = y, h_3 = h_1 + h_2, h_4 = h_1 - h_2, h_5 = h_3 \times h_4$
In the above examples, the gates $g_5$ and $h_5$ compute the same polynomial $f(x, y) = x^2 - y^2$.
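Note that both computations reuse only input nodes: $g_1$ and $g_2$ each feed a multiplication gate twice, and $h_1$ and $h_2$ each feed both $h_3$ and $h_4$. Under the strict out-degree condition of Remark 1, neither is a formula as written; under the usual convention that input nodes may be duplicated (one copy per use), both become formulas. A circuit that is not a formula must reuse the output of a gate node.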
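To make the straight-line view concrete, the two computations of Example 1 can be evaluated gate by gate. The following is a minimal sketch, assuming the field is $\mathbb{Q}$ (via `fractions.Fraction`); the tuple encoding of gates and the helper name `evaluate_slp` are illustrative choices, not notation from the lecture.

```python
from fractions import Fraction
import operator

# Hypothetical encoding: a gate is ("in", name), ("const", value), or (op, i, j),
# where i and j are the (0-indexed) positions of earlier gates.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def evaluate_slp(gates, point):
    """Evaluate every gate of a straight-line program at the given point."""
    values = []
    for gate in gates:
        if gate[0] == "in":
            values.append(Fraction(point[gate[1]]))
        elif gate[0] == "const":
            values.append(Fraction(gate[1]))
        else:
            op, i, j = gate
            values.append(OPS[op](values[i], values[j]))
    return values

# g_1..g_5 and h_1..h_5 from Example 1, shifted to 0-based indices.
G = [("in", "x"), ("in", "y"), ("*", 0, 0), ("*", 1, 1), ("-", 2, 3)]
H = [("in", "x"), ("in", "y"), ("+", 0, 1), ("-", 0, 1), ("*", 2, 3)]

point = {"x": 5, "y": 3}
g5 = evaluate_slp(G, point)[-1]  # 5*5 - 3*3 = 16
h5 = evaluate_slp(H, point)[-1]  # (5 + 3) * (5 - 3) = 16
```

Agreement at a single point is only a sanity check; it does not by itself prove that the two gates compute the same polynomial.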
Definition 2 (Algebraic Branching Programs): An algebraic branching program (ABP) is a directed graph with the following properties:
- there are two distinguished nodes, the source node and the sink node.
- all nodes are organized in layers, where the source node is the only node in the first layer and the sink node is the only node in the last layer.
- each edge is directed from a node in layer $i$ to a node in layer $i+1$, and is labeled by an affine polynomial in the variables $x_1, \ldots, x_n$.
- given any path from source to sink, the polynomial computed by this path is the product of the polynomials on the edges of the path.
- the polynomial computed by the branching program is the sum of the polynomials computed by all paths from source to sink.
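The sum over paths in Definition 2 can be evaluated layer by layer, without enumerating paths. Below is a minimal sketch under some assumptions: the encoding (`layer_sizes`, `edges`, affine labels as `(constant, {variable: coefficient})`) and the function names are hypothetical, and the example ABP is one possible program for $x^2 - y^2$, not one taken from the lecture.

```python
from fractions import Fraction

def eval_affine(label, point):
    """Evaluate an affine form given as (constant, {variable: coefficient})."""
    const, coeffs = label
    return Fraction(const) + sum(Fraction(c) * point[v] for v, c in coeffs.items())

def evaluate_abp(layer_sizes, edges, point):
    """Sum, over all source-to-sink paths, of the product of the edge labels.

    layer_sizes[k] is the number of nodes in layer k; edges[k] lists triples
    (u, v, label) for edges from node u of layer k to node v of layer k + 1.
    """
    # acc[u] = total weight of all paths from the source to node u of the current layer
    acc = [Fraction(1)]
    for k, layer_edges in enumerate(edges):
        nxt = [Fraction(0)] * layer_sizes[k + 1]
        for u, v, label in layer_edges:
            nxt[v] += acc[u] * eval_affine(label, point)
        acc = nxt
    return acc[0]  # the last layer contains only the sink

# Width-2 ABP: source -x-> a -x-> sink and source -y-> b -(-y)-> sink.
layer_sizes = [1, 2, 1]
edges = [
    [(0, 0, (0, {"x": 1})), (0, 1, (0, {"y": 1}))],
    [(0, 0, (0, {"x": 1})), (1, 0, (0, {"y": -1}))],
]
value = evaluate_abp(layer_sizes, edges, {"x": 5, "y": 3})  # 25 - 9 = 16
```

The loop is a sequence of vector-matrix products, one per layer.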
Having defined the computational models that we will study, we can now define the complexity measures of interest.
Complexity Measures and Complexity Classes
Complexity Measures
- The size of a circuit, denoted $\mathrm{ckt-size}(\Phi)$, is the number of gates in the circuit.
- The depth of a circuit, denoted $\mathrm{depth}(\Phi)$, is the length of the longest path from an input node to an output node.
- The degree of a circuit, denoted $\mathrm{ckt-deg}(\Phi)$, is the maximum formal degree of any gate in the circuit, where the formal degree is defined inductively as follows:
  - The formal degree of an input node is 1.
  - The formal degree of a constant node is 0.
  - The formal degree of a gate labeled by $+$ or $-$ is the maximum of the formal degrees of its children.
  - The formal degree of a gate labeled by $\times$ (or $\div$) is the sum (or the difference) of the formal degrees of its children.
In the case of formulas, the definitions above are the same.
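The inductive definition of formal degree translates directly into a single pass over the gates. The sketch below assumes a hypothetical tuple encoding of gates (`("in", name)`, `("const", value)`, or `(op, i, j)` with `i`, `j` indexing earlier gates); the example shows that the formal degree can exceed the degree of the polynomial actually computed.

```python
def formal_degrees(gates):
    """Return the formal degree of every gate, following the inductive rules."""
    deg = []
    for gate in gates:
        kind = gate[0]
        if kind == "in":
            deg.append(1)
        elif kind == "const":
            deg.append(0)
        elif kind in ("+", "-"):
            deg.append(max(deg[gate[1]], deg[gate[2]]))
        elif kind == "*":
            deg.append(deg[gate[1]] + deg[gate[2]])
        elif kind == "/":
            deg.append(deg[gate[1]] - deg[gate[2]])
        else:
            raise ValueError(f"unknown gate label: {kind}")
    return deg

# x*x - x*x computes the zero polynomial, yet its formal degree is 2.
CANCEL = [("in", "x"), ("*", 0, 0), ("-", 1, 1)]
degrees = formal_degrees(CANCEL)   # [1, 2, 2]
circuit_degree = max(degrees)      # 2
```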
For branching programs, the size of a branching program is the number of nodes in the program, and the depth of a branching program is the number of layers in the program. Another important measure for branching programs is the width of a branching program, which is the maximum number of nodes in any layer of the program.
Now that we have defined the basic computational model and complexity measures, we can move on to the complexity classes that we will be studying in this course.
Complexity Classes
Definition 3 (p-bounded families): A family of polynomials $\{f_n\}_{n \in \mathbb{N}}$ is said to be p-bounded if there exists a polynomial $t : \mathbb{N} \to \mathbb{N}$ such that $f_n \in \mathbb{F}[x_1, \ldots, x_{t(n)}]$ and $\mathrm{deg}(f_n) \leq t(n)$ for all $n$.
We are now ready to define the algebraic non-uniform analogue of the complexity class $\mathrm{P}$. This definition was given by Valiant, who called it the class of p-computable families of polynomials. Nowadays, this class is known as $\mathrm{VP}$: the class of families of polynomials of polynomial degree that can be computed by algebraic circuits of polynomial size.
Definition 4 (VP): The class $\mathrm{VP}_{\mathbb{F}}$ is the class of all p-bounded families of polynomials $\{f_n\}_{n \in \mathbb{N}}$ over $\mathbb{F}$ such that there exists a polynomial $t : \mathbb{N} \to \mathbb{N}$ and a family of algebraic circuits $\{\Phi_n\}_{n \in \mathbb{N}}$ such that $\Phi_n$ computes $f_n$ and $\mathrm{ckt-size}(\Phi_n) \leq t(n)$ for all $n$.
A non-example: the family of polynomials $\{x^{2^n} \}_{n \in \mathbb{N}}$ is not in $\mathrm{VP}$, since the degree of the polynomials in the family grows exponentially, even though this family can be computed by a polynomial size circuit.
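The non-example can be checked with repeated squaring: $n$ multiplication gates suffice for $x^{2^n}$, while the formal degree doubles at each gate. A minimal sketch, using a hypothetical tuple encoding of gates and hypothetical function names:

```python
def repeated_squaring_gates(n):
    """Gates of a straight-line program computing x^(2^n) with n multiplications."""
    gates = [("in", "x")]
    for k in range(n):
        gates.append(("*", k, k))  # gate k + 1 squares gate k
    return gates

def formal_degree_of_output(gates):
    """Formal degree of the last gate (only input and multiplication gates occur)."""
    deg = []
    for gate in gates:
        if gate[0] == "in":
            deg.append(1)
        else:
            deg.append(deg[gate[1]] + deg[gate[2]])
    return deg[-1]

n = 10
gates = repeated_squaring_gates(n)
num_mult_gates = len(gates) - 1                  # 10
output_degree = formal_degree_of_output(gates)   # 2**10 = 1024
```

The size grows linearly in $n$ while the degree grows as $2^n$, so the family fails the p-boundedness requirement rather than the size requirement.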
Remark 2: The class $\mathrm{VP}_{e, \mathbb{F}}$ is the class of all p-bounded families of polynomials $\{f_n\}_{n \in \mathbb{N}}$ over $\mathbb{F}$ that can be computed by algebraic formulas of polynomial size, that is, of size at most $t(n)$ for some polynomial $t : \mathbb{N} \to \mathbb{N}$. The $e$ in the subscript stands for “expressible”: these are the families of polynomials that can be expressed by a formula of polynomial size.
We are now ready to define the class $\mathrm{VNP}$, which is the algebraic analogue of the class $\mathrm{NP}$. Valiant called it the class of p-definable families of polynomials.
Definition 5 (VNP): A family of polynomials $\{f_n\}_{n \in \mathbb{N}}$ is said to be p-definable if there are polynomials $t, s : \mathbb{N} \to \mathbb{N}$ and a family $\{ g_n \}_{n \in \mathbb{N}} \in \mathrm{VP}_{\mathbb{F}}$ such that for every $n$, $$ f_n(x_1, \dots, x_{t(n)}) = \sum_{y \in \{0,1\}^{s(n)}} g_n(x_1, \ldots, x_{t(n)}, y_1, \dots, y_{s(n)})$$ The class $\mathrm{VNP}_{\mathbb{F}}$ is the class of all p-definable families of polynomials.
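To see the shape of a p-definable family, here is a small sketch of the Boolean sum in Definition 5. The choice $g(x, y) = \prod_i (y_i x_i + 1 - y_i)$ with $s(n) = n$ is an illustrative assumption, not an example from the lecture; it was picked because the sum has the closed form $\prod_i (1 + x_i)$, which makes the result easy to check. The sum itself ranges over $2^{s(n)}$ terms.

```python
from itertools import product
from fractions import Fraction

def g(x, y):
    """g(x, y) = prod_i (y_i * x_i + 1 - y_i); each factor uses a constant number of gates."""
    result = Fraction(1)
    for xi, yi in zip(x, y):
        result *= yi * xi + 1 - yi
    return result

def boolean_sum(x):
    """f(x) = sum of g(x, y) over all y in {0,1}^n (here s(n) = n)."""
    return sum(g(x, y) for y in product((0, 1), repeat=len(x)))

x = [Fraction(2), Fraction(3), Fraction(5)]
f_value = boolean_sum(x)                              # 8 terms, total 72
closed_form = (1 + x[0]) * (1 + x[1]) * (1 + x[2])    # 3 * 4 * 6 = 72
```

In this toy case $f$ happens to have a small circuit as well; the definition only requires that $g$ does.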
From the definitions above, one sees that $\mathrm{VP}_{\mathbb{F}} \subseteq \mathrm{VNP}_{\mathbb{F}}$ for any base field $\mathbb{F}$ (take $s(n) = 0$, so that the sum has a single term).
After defining the classes $\mathrm{VP}_{\mathbb{F}}$ and $\mathrm{VNP}_{\mathbb{F}}$, Valiant made the following conjecture:
Valiant’s Hypothesis I: $\mathrm{VP}_{\mathbb{F}} \neq \mathrm{VNP}_{\mathbb{F}}$.
As in Boolean complexity theory, we would like to classify families of polynomials in terms of their complexity. Now that we have our main classes defined, we can define the concept of reduction between polynomial families. This is captured by the concept of projection, which simply says that a polynomial $f$ is “easier” than another polynomial $g$ if $f$ can be obtained from $g$ by substituting variables or constants for the variables of $g$.
Definition 6 (Projection): A polynomial $f(x_1, \ldots, x_n)$ is a projection of a polynomial $g(y_1, \ldots, y_m)$ if there exists an assignment $$\sigma : \{y_1, \ldots, y_m\} \to \mathbb{F} \cup \{x_1, \ldots, x_n\}$$ such that $$f(x_1, \ldots, x_n) = g(\sigma(y_1), \ldots, \sigma(y_m)).$$
A family of polynomials $\{f_n\}_{n \in \mathbb{N}}$ is a p-projection of a family of polynomials $\{g_n\}_{n \in \mathbb{N}}$ if there is a polynomial $t : \mathbb{N} \to \mathbb{N}$ such that for every $n$, there is $\alpha(n) \leq t(n)$ such that $f_n$ is a projection of $g_{\alpha(n)}$.
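As a small illustration of Definition 6, the polynomial $x^2 - y^2$ is a projection of $\mathrm{Det}_2$: substitute $x$ for the two diagonal entries and $y$ for the two off-diagonal entries. The sketch below checks this numerically on a grid of rational points; the function names and the particular assignment are illustrative assumptions.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Det_2 of the symbolic matrix [[a, b], [c, d]], evaluated at a point."""
    return a * d - b * c

def f_via_projection(x, y):
    # Hypothetical assignment sigma: X11 -> x, X12 -> y, X21 -> y, X22 -> x.
    return det2(x, y, y, x)

def f_direct(x, y):
    return x * x - y * y

samples = [(Fraction(a), Fraction(b)) for a in range(-2, 3) for b in range(-2, 3)]
agree = all(f_via_projection(a, b) == f_direct(a, b) for a, b in samples)
```

Here every variable of $\mathrm{Det}_2$ is sent to a variable; in general $\sigma$ may also send variables to constants in $\mathbb{F}$.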
Important Families of Polynomials
We will now discuss some important families of polynomials that will be of interest in this course, and are of general interest more broadly in other areas of mathematics and sciences.
Power-Sum Polynomials
The power-sum polynomials are defined as follows: $$PS_d(x_1, \ldots, x_n) = \sum_{i=1}^n x_i^d.$$
This family of polynomials appears in the study of Waring rank (equivalently, the rank of symmetric tensors), and has connections to the complexity of matrix multiplication, for instance.
Elementary Symmetric Polynomials
The elementary symmetric polynomials are defined as follows: $$ES_d(x_1, \ldots, x_n) = \sum_{1 \leq i_1 < i_2 < \ldots < i_d \leq n} x_{i_1} x_{i_2} \cdots x_{i_d}.$$
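The definition has $\binom{n}{d}$ monomials, but $ES_d$ can be evaluated with $O(nd)$ arithmetic operations using the recurrence $ES_j(x_1, \ldots, x_i) = ES_j(x_1, \ldots, x_{i-1}) + x_i \cdot ES_{j-1}(x_1, \ldots, x_{i-1})$. The sketch below compares both; the function names are illustrative, and the recurrence is a standard identity rather than something stated in the lecture.

```python
from itertools import combinations
from math import prod

def es_bruteforce(d, xs):
    """ES_d straight from the definition: one monomial per d-subset of the variables."""
    return sum(prod(c) for c in combinations(xs, d))

def es_dynamic(d, xs):
    """ES_d via the recurrence e_j <- e_j + x * e_{j-1}, one variable at a time."""
    e = [1] + [0] * d
    for x in xs:
        for j in range(d, 0, -1):  # descending j so that each x is used at most once
            e[j] += x * e[j - 1]
    return e[d]

xs = [2, 3, 5, 7]
by_definition = es_bruteforce(2, xs)  # 6 + 10 + 14 + 15 + 21 + 35 = 101
by_recurrence = es_dynamic(2, xs)     # 101
```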
Iterated Matrix Multiplication
Let $X^{(1)}, \ldots, X^{(d)}$ be $n \times n$ symbolic matrices. The iterated matrix multiplication polynomial is defined as follows: $$\mathrm{IMM}_d(X^{(1)}, \ldots, X^{(d)}) := \mathrm{Tr}(X^{(1)} X^{(2)} \cdots X^{(d)}) = \sum_{i_1, \ldots, i_d \in [n]} X^{(1)}_{i_1, i_2} X^{(2)}_{i_2, i_3} \cdots X^{(d)}_{i_d, i_1}.$$
From the definition of $\mathrm{IMM}$, we can see that the family of iterated matrix multiplication polynomials is “complete” for the class of algebraic branching programs.
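The two expressions for $\mathrm{IMM}_d$ can be compared numerically: the trace of the product uses $d - 1$ matrix multiplications, while the explicit sum has $n^d$ terms, one per closed walk $i_1 \to i_2 \to \cdots \to i_d \to i_1$ through the layers. A minimal sketch with hypothetical function names and an arbitrary small integer instance:

```python
from itertools import product

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def imm_trace(mats):
    """Tr(X1 X2 ... Xd), computed by multiplying the matrices left to right."""
    P = mats[0]
    for M in mats[1:]:
        P = mat_mul(P, M)
    return sum(P[i][i] for i in range(len(P)))

def imm_sum(mats):
    """The explicit sum over index tuples (i_1, ..., i_d), with i_{d+1} = i_1."""
    n, d = len(mats[0]), len(mats)
    total = 0
    for idx in product(range(n), repeat=d):
        term = 1
        for k in range(d):
            term *= mats[k][idx[k]][idx[(k + 1) % d]]
        total += term
    return total

mats = [[[1, 2], [3, 4]], [[0, 1], [1, 1]], [[2, 0], [1, 3]]]
via_trace = imm_trace(mats)  # product is [[7, 9], [15, 21]], trace 28
via_sum = imm_sum(mats)      # 28
```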
Determinant
Given a symbolic $n \times n$ matrix $X$, the determinant polynomial is defined as follows: $$\mathrm{Det}_n(X) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \prod_{i=1}^n X_{i, \sigma(i)}.$$
The determinant is a central object in linear algebra, and appears in many areas of mathematics and computer science.
Another way to define the determinant, which will be useful for us, is by defining it in terms of cycle covers of the complete directed graph (with self loops), which are in bijection with permutations.
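A cycle cover of a directed graph $G = ([n], E)$ is a set of directed cycles $C_1, \ldots, C_r$ such that each vertex of $G$ is in exactly one cycle. If the edge $(i, j)$ has weight $X_{i,j}$, the weight of a cycle cover is the product of the weights of the edges in the cycles. The sign of a cycle cover with $r$ cycles is $(-1)^{n-r}$, which is the sign of the corresponding permutation.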
Thus, we have the following formula: $$\mathrm{Det}_n(X) = \sum_{C \in \mathcal{CC}} \text{sign}(C) \cdot \text{weight}(C),$$ where $\mathcal{CC}$ is the set of cycle covers of the complete directed graph $\vec{K_{n}}$ (with self-loops).
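The cycle-cover formula can be checked against the permutation formula on a small matrix. The sketch below assumes the sign convention $\text{sign}(C) = (-1)^{n - r}$ for a cover with $r$ cycles (self-loops count as cycles), which agrees with $\mathrm{sgn}(\sigma)$ of the corresponding permutation; the function names and the test matrix are illustrative.

```python
from itertools import permutations

def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple (fixed points included)."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            v = start
            while not seen[v]:
                seen[v] = True
                v = perm[v]
    return cycles

def det_cycle_covers(X):
    """Sum of sign(C) * weight(C) over cycle covers, with sign(C) = (-1)^(n - r)."""
    n = len(X)
    total = 0
    for perm in permutations(range(n)):
        weight = 1
        for i in range(n):
            weight *= X[i][perm[i]]  # edge i -> perm[i] has weight X[i][perm[i]]
        total += (-1) ** (n - cycle_count(perm)) * weight
    return total

def det_leibniz(X):
    """Permutation formula, with sgn computed from the number of inversions."""
    n = len(X)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        weight = 1
        for i in range(n):
            weight *= X[i][perm[i]]
        total += (-1) ** inversions * weight
    return total

X = [[2, 0, 1], [1, 3, 2], [0, 1, 4]]
via_covers = det_cycle_covers(X)  # 21
via_leibniz = det_leibniz(X)      # 21
```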
Permanent
Given a symbolic $n \times n$ matrix $X$, the permanent polynomial is defined as follows: $$\mathrm{Perm}_n(X) = \sum_{\sigma \in S_n} \prod_{i=1}^n X_{i, \sigma(i)}.$$
The permanent polynomial is a central object in combinatorics and computer science, characterizing the complexity of counting problems (as was proven by Valiant at around the same time as he made the definitions above).
Similarly to the determinant, the permanent can be defined in terms of cycle covers of the complete directed graph (with self-loops), this time without signs.
Thus, we have the following formula: $$\mathrm{Perm}_n(X) = \sum_{C \in \mathcal{CC}} \text{weight}(C),$$ where $\mathcal{CC}$ is the set of cycle covers of the complete directed graph $\vec{K_{n}}$ (with self-loops).
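For a 0/1 matrix, every cycle cover has weight 0 or 1, so the permanent counts the cycle covers of the directed graph whose adjacency matrix it is. A brute-force sketch (exponential time, for illustration only; the function name and the example graph are assumptions):

```python
from itertools import permutations
from math import prod

def permanent(X):
    """Permanent straight from the definition: one term per permutation."""
    n = len(X)
    return sum(prod(X[i][p[i]] for i in range(n)) for p in permutations(range(n)))

# Directed 3-cycle 0 -> 1 -> 2 -> 0 with a self-loop at every vertex.
A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
num_cycle_covers = permanent(A)  # 2: three self-loops, or the single 3-cycle

all_ones = [[1] * 4 for _ in range(4)]
num_permutations = permanent(all_ones)  # 4! = 24
```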
There are many other interesting families of polynomials, important in several other areas of mathematics, which we do not list here. Examples include immanants (which generalize both the determinant and the permanent), Schur polynomials, and multivariate matching polynomials, among others.