Lecture 9 - Depth Reduction II

In the previous lecture, we saw Brent’s formula balancing technique, as well as Hyafil’s circuit balancing technique, which were used to reduce the depth of a circuit. While Brent’s technique reduced the depth of a formula while preserving the size of the formula (up to a constant factor), Hyafil’s technique reduced the depth of a circuit while incurring a quasi-polynomial blow-up in the size of the circuit.

In this lecture, we will see a different technique for depth reduction, which is due to Valiant, Skyum, Berkowitz, and Rackoff. They show that any arithmetic circuit of size $s$ computing a polynomial of degree $d$ can be converted into an arithmetic circuit of depth $O((\log s + \log d )\log d)$ and size $\text{poly}(s, d)$ that computes the same polynomial.

Before we state the theorem, recall that Hyafil’s technique was somewhat coarse: it largely ignored the structure of the circuit, and in particular the relations between intermediate computations in the circuit (the depth reduction had a “large gap” in degree).

We have seen in Baur-Strassen how to take derivatives with respect to a gate in a circuit. Let us recall that definition.

Definition 1 (Partial derivative with respect to a gate): Let $\Phi(\vec{x})$ be an algebraic circuit and $u, v$ be two gates in $\Phi$, computing polynomials $f_u, f_v$ respectively. Let $\Phi_{u = y}$ denote the circuit obtained by deleting the incoming edges to $u$ and replacing the output of $u$ by the variable $y$. Let $f_{v, u}(\vec{x}, y)$ be the polynomial computed by $v$ in the circuit $\Phi_{u = y}$. Now, define the partial derivative of $v$ with respect to $u$ as the polynomial $$\partial_u f_v := (\partial_y f_{v, u})|_{y = f_u}.$$

Gate derivatives satisfy the following properties, which follow from the basic rules of differentiation.

Proposition 1 (properties of gate derivatives): Let $\Phi$ be a homogeneous algebraic circuit computing a polynomial $f$, and let $u, v$ be two gates in $\Phi$. Then, the following properties hold:

1. Either $\partial_u f_v = 0$ or $\deg(\partial_u f_v) = \deg(f_v) - \deg(f_u)$.
2. If $v$ is a sum gate with children $v_1, v_2$, then $\partial_u f_v = \partial_u f_{v_1} + \partial_u f_{v_2}$.
3. Suppose $v$ is a product gate with children $v_1, v_2$ such that $\deg(f_{v_1}) \geq \deg(f_{v_2})$. If $\deg(f_{u}) > \deg(f_v)/2$, then $\partial_u f_v = f_{v_2} \cdot \partial_u f_{v_1}$.

Now that we have the notion of gate derivatives, we can refine our depth reduction. We will now define the notion of a frontier of a circuit.

Definition 2 (Frontier of a circuit): For an integer $r \geq 0$, the $r$-th frontier of a circuit $\Phi$, denoted by $\mathcal{F}_r(\Phi)$, is the set of all multiplication gates in $\Phi$ that compute a polynomial of degree larger than $r$, and whose children compute polynomials of degree less than or equal to $r$. That is $$\mathcal{F}_r(\Phi) := \{ v \in \Phi \mid f_v = f_{v_1} \cdot f_{v_2}, \quad \deg(f_v) > r, \quad \deg(f_{v_1}), \deg(f_{v_2}) \leq r \}.$$

With the above definition, we can get a refined decomposition of the circuit into layers, where each layer is a frontier of the circuit.

The following proposition shows the usefulness of the frontiers.

Proposition 2 (Frontier decomposition): Let $\Phi$ be a homogeneous algebraic circuit and $r > 0$ be an integer. Let $u,v$ be two gates in $\Phi$ such that $\deg(f_u) \leq r < \deg(f_v) < 2\deg(f_u)$. Then, we have $$ f_v = \sum_{w \in \mathcal{F}_r(\Phi)} f_w \cdot \partial_w f_v \quad \text{ and } \quad \partial_u f_v = \sum_{w \in \mathcal{F}_r(\Phi)} \partial_u f_w \cdot \partial_w f_v.$$

Proof: Let’s prove the first equality by induction on the length of the longest path from $\mathcal{F}_r(\Phi)$ to $v$. Let $v_1, v_2$ be the children of $v$.

Base case: $v \in \mathcal{F}_r(\Phi)$.

In this case, for any $w \in \mathcal{F}_r(\Phi)$ different from $v$, we have $\partial_w f_v = 0$: every gate in the subcircuit rooted at $v$ other than $v$ itself has degree at most $r$, so $w$ does not appear in it. Since $\partial_v f_v = 1$, we have $$\sum_{w \in \mathcal{F}_r(\Phi)} f_w \cdot \partial_w f_v = f_v \cdot \partial_v f_v = f_v.$$

Inductive step: we have two cases to consider.

Case 1: $v$ is a sum gate.

In this case, we have $f_v = f_{v_1} + f_{v_2}$, and by homogeneity $\deg(f_v) = \deg(f_{v_1}) = \deg(f_{v_2})$. Thus, by induction we have $$f_{v_1} = \sum_{w \in \mathcal{F}_r(\Phi)} f_w \cdot \partial_w f_{v_1} \quad \text{ and } \quad f_{v_2} = \sum_{w \in \mathcal{F}_r(\Phi)} f_w \cdot \partial_w f_{v_2}.$$ Hence, we have $$f_v = f_{v_1} + f_{v_2} = \sum_{w \in \mathcal{F}_r(\Phi)} f_w \cdot (\partial_w f_{v_1} + \partial_w f_{v_2}) = \sum_{w \in \mathcal{F}_r(\Phi)} f_w \cdot \partial_w f_v,$$ where we have used Proposition 1 Part 2 in the last equality.

Case 2: $v$ is a product gate.

Assume w.l.o.g. that $\deg(f_{v_1}) \geq \deg(f_{v_2})$. Since $\deg(f_v) > r$ and $v \not\in \mathcal{F}_r(\Phi)$, some child of $v$ has degree larger than $r$. Together with $\deg(f_{v_1}) \leq \deg(f_v) < 2\deg(f_u) \leq 2r$, this gives $r < \deg(f_{v_1}) < 2r$. By induction, we have $$f_{v_1} = \sum_{w \in \mathcal{F}_r(\Phi)} f_w \cdot \partial_w f_{v_1}.$$ For all $w \in \mathcal{F}_r(\Phi)$, we have $\deg(f_v) < 2r < 2 \cdot \deg(f_w)$. Thus, by Proposition 1 Part 3, we have $$\partial_w f_v = f_{v_2} \cdot \partial_w f_{v_1}.$$ Hence, we have $$f_v = f_{v_1} \cdot f_{v_2} = \sum_{w \in \mathcal{F}_r(\Phi)} f_w \cdot f_{v_2} \cdot \partial_w f_{v_1} = \sum_{w \in \mathcal{F}_r(\Phi)} f_w \cdot \partial_w f_v.$$

The proof of the second equality is similar to the first one.
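As a sanity check of Proposition 2, consider the following small circuit (a hypothetical example chosen for illustration, not taken from the sources). Let $u = x_1 \cdot x_2$, $w_a = u \cdot x_3$, $w_b = u \cdot x_4$ and $v = w_a + w_b$, and take $r = 2$. Then $\deg(f_u) = 2 \leq r < \deg(f_v) = 3 < 2\deg(f_u) = 4$, and $\mathcal{F}_2(\Phi) = \{w_a, w_b\}$, since both are product gates of degree $3$ whose children have degree at most $2$. The gate derivatives are $\partial_{w_a} f_v = \partial_{w_b} f_v = 1$, $\partial_u f_{w_a} = x_3$ and $\partial_u f_{w_b} = x_4$. The first equality reads $$\sum_{w \in \mathcal{F}_2(\Phi)} f_w \cdot \partial_w f_v = x_1 x_2 x_3 \cdot 1 + x_1 x_2 x_4 \cdot 1 = x_1 x_2 (x_3 + x_4) = f_v,$$ and the second equality reads $$\sum_{w \in \mathcal{F}_2(\Phi)} \partial_u f_w \cdot \partial_w f_v = x_3 \cdot 1 + x_4 \cdot 1 = x_3 + x_4 = \partial_u f_v.$$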

The point of the above proposition is to show that we can use the frontier to compute all the gates with degree between $r$ and $2r$ in the circuit. We are now ready to state and prove the Valiant-Skyum-Berkowitz-Rackoff theorem.

Valiant-Skyum-Berkowitz-Rackoff (VSBR) Theorem

Theorem 1 (Valiant-Skyum-Berkowitz-Rackoff): For any arithmetic circuit $\Phi$ of size $s$ computing a polynomial $f$ of degree $d$, there is an arithmetic circuit $\Psi$ of depth $O((\log s + \log d) \log d)$ and size $\text{poly}(s, d)$ that computes $f$.

The theorem above will follow if we prove the following version:

Theorem (variant of theorem 1): For any homogeneous arithmetic circuit $\Phi$ of size $s$ computing a polynomial $f$ of degree $d$, there is a homogeneous arithmetic circuit $\Psi$ computing $f$ having the following properties:

$\Psi$ has alternating layers of addition and multiplication gates.
Each multiplication gate $v$ computes a product of at most $5$ forms, each with degree at most $\deg(v)/2$.
Sum gates have arbitrary fan-in.
The size of $\Psi$ is $\text{poly}(s, d)$.

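In particular, since the degree at least halves across every multiplication layer, the above properties imply that the depth of $\Psi$ is $O(\log d)$. Theorem 1 then follows by first homogenizing the given circuit, which by the standard construction increases its size by a factor of $O(d^2)$, and then replacing each sum gate of $\Psi$ by a balanced tree of fan-in $2$ sum gates of depth $O(\log s + \log d)$.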

Proof: Let $n$ be the number of variables. Note that we can assume w.l.o.g. that $s \geq n$, as a circuit of size $< n$ cannot use all the variables. We will construct the circuit $\Psi$ iteratively, where in iteration $k$ we will do the following:

compute all the forms $f_v$ from $\Phi$ such that $2^{k-1} < \deg(f_v) \leq 2^k$.
compute all the forms $\partial_u f_v$ for all $u, v$ such that $2^{k-1} < \deg(f_v) - \deg(f_u) \leq 2^k$, and $\deg(f_v) < 2\deg(f_u)$.
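The bookkeeping of which form is produced in which iteration can be sketched in Python as follows. The function names `form_iteration` and `derivative_iteration` are hypothetical, and the sketch only tracks degrees, not the circuit itself. Pairs with $\deg(f_u) > \deg(f_v)$ are skipped because Proposition 1 Part 1 forces $\partial_u f_v = 0$ for them.

```python
def form_iteration(deg_v):
    """Smallest k >= 0 with deg_v <= 2**k, i.e. the iteration computing f_v."""
    k = 0
    while 2 ** k < deg_v:
        k += 1
    return k


def derivative_iteration(deg_v, deg_u):
    """Iteration computing the gate derivative of v w.r.t. u, or None.

    Only pairs with deg_u <= deg_v < 2 * deg_u are scheduled; the iteration
    is determined by the degree gap deg_v - deg_u.
    """
    if not (deg_u <= deg_v < 2 * deg_u):
        return None
    return form_iteration(deg_v - deg_u)


# A form of degree 5 satisfies 2**2 < 5 <= 2**3, so it belongs to iteration 3.
print(form_iteration(5))           # 3
# A gap of 7 - 4 = 3 satisfies 2**1 < 3 <= 2**2, so it belongs to iteration 2.
print(derivative_iteration(7, 4))  # 2
```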

Base case: In iteration $k = 0$, we compute all the forms $f_v$ such that $\deg(f_v) \leq 2^0 = 1$. As these are just linear forms in the input variables, we can compute each of them by an addition gate with fanin $\leq s$ and depth $1$. Also, for every two gates $u, v$ such that $\deg(f_v) - \deg(f_u) \leq 1$, Proposition 1 Part 1 implies that $\partial_u f_v = 0$ or $\deg(\partial_u f_v) = \deg(f_v) - \deg(f_u) \leq 1$. Hence, we can compute all the forms $\partial_u f_v$ by addition gates with fanin $\leq s$ and depth $1$.

Inductive step: Assume that we have completed iteration $k$. Let us compute iteration $k+1$. Let $v$ be a gate in $\Phi$ such that $2^k < \deg(f_v) \leq 2^{k+1}$ and let the frontier parameter be $r = \lfloor \deg(f_v)/2 \rfloor$, so that $r \leq 2^k$. By Proposition 2, we have $$f_v = \sum_{w \in \mathcal{F}_r(\Phi)} f_w \cdot \partial_w f_v = \sum_{w \in \mathcal{F}_r(\Phi)} f_{w_1} \cdot f_{w_2} \cdot \partial_w f_v,$$ where $f_w = f_{w_1} \cdot f_{w_2}$ (since $w \in \mathcal{F}_r(\Phi)$, it must be a multiplication gate). As $w \in \mathcal{F}_r(\Phi)$, we have $r < \deg(f_w) \leq 2r$ and $\deg(f_{w_1}), \deg(f_{w_2}) \leq r$. Thus, by the inductive hypothesis, we have already computed $f_{w_1}, f_{w_2}$. Similarly, we have that $\deg(\partial_w f_v) \leq \deg(f_v) - \deg(f_w) \leq r$ and $\deg(f_v) < 2\deg(f_w)$. Thus, we have already computed $\partial_w f_v$. Hence, we can compute $f_v$ by an addition gate with fanin $\leq 2 \cdot s$ whose children are product gates of $3$ forms each, each of degree at most $r$.

Now, let us compute the derivatives $\partial_u f_v$ such that $2^k < \deg(f_v) - \deg(f_u) \leq 2^{k+1}$ and $\deg(f_v) < 2\deg(f_u)$. Let the frontier parameter be $r = \lfloor \frac{\deg(f_v) + \deg(f_u)}{2} \rfloor$. By Proposition 2, we have $$\partial_u f_v = \sum_{w \in \mathcal{F}_r(\Phi)} \partial_u f_w \cdot \partial_w f_v.$$ As $w \in \mathcal{F}_r(\Phi)$ and we can assume $\partial_w f_v \neq 0$, we have $r < \deg(f_w) \leq \min\{\deg(f_v), 2r\}$ and, naming the children of $w$ so that $w_1$ has the larger degree, $r \geq \deg(f_{w_1}) \geq \deg(f_{w_2})$. Moreover, we have $2 \deg(f_u) > \deg(f_v) \geq \deg(f_w)$. Hence, by Proposition 1 Part 3, we have $\partial_u f_w = f_{w_2} \cdot \partial_u f_{w_1}$. By Proposition 1 Part 1, we have $\deg(\partial_w f_v) = \deg(f_v) - \deg(f_w) \leq \lfloor \frac{\deg(f_v) - \deg(f_u)}{2} \rfloor \leq 2^k$. Similarly, we have $\deg(\partial_u f_{w_1}) \leq \deg(f_{w_1}) - \deg(f_u) \leq \lfloor \frac{\deg(f_v) - \deg(f_u)}{2} \rfloor \leq 2^k$. Thus, by the inductive hypothesis, we have already computed $\partial_w f_v$ and $\partial_u f_{w_1}$.

The only problem now is that we need to compute $f_{w_2}$, and by the above we can only upper bound its degree by $\deg(f_v) - \deg(f_u)$, which may exceed $2^k$. In this case, we simply expand $f_{w_2}$ as in Proposition 2, and we have $$\partial_u f_v = \sum_{w \in \mathcal{F}_r(\Phi)} f_{w_2} \cdot \partial_u f_{w_1} \cdot \partial_w f_v = \sum_{w \in \mathcal{F}_r(\Phi)} \left( \sum_{z \in \mathcal{F}_{m}(\Phi)} f_{z_1} \cdot f_{z_2} \cdot \partial_z f_{w_2} \right) \cdot \partial_u f_{w_1} \cdot \partial_w f_v,$$ where $m = \lfloor \frac{\deg(f_v) - \deg(f_u)}{2} \rfloor \leq 2^k$. Thus, by the inductive hypothesis, we have also computed $f_{z_1}, f_{z_2}$ and $\partial_z f_{w_2}$. (If $\deg(f_{w_2}) \leq m$, then $f_{w_2}$ has already been computed and no expansion is needed.)

Now, all the terms in the multiplication have degree at most $m$, and we have at most $5$ terms in each product. This computation can be done by an addition gate with fanin $\leq 4 s^2$ whose inputs are product gates of $5$ forms each, each of degree at most $m$.

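For the size bound, $\Psi$ computes at most $s$ forms $f_v$ and at most $s^2$ forms $\partial_u f_v$, each using one addition gate and at most $4s^2$ product gates, so the size of $\Psi$ is $O(s^4)$. This concludes the proof of the theorem.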
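The floor arithmetic in the derivative step of the proof is easy to get wrong, so here is a small brute-force Python check of the degree bounds used there. The function name `check_degree_bounds` and the search range are choices made for this sketch. It only tests the inequalities on degrees and does not build any circuit.

```python
def check_degree_bounds(max_degree):
    """Check the degree inequalities of the derivative step by brute force.

    Returns the number of (deg_v, deg_u, deg_w, deg_w1) combinations checked
    and raises AssertionError if any inequality fails.
    """
    checked = 0
    for deg_u in range(1, max_degree + 1):
        # Only pairs with deg_u < deg_v < 2 * deg_u reach the inductive step.
        for deg_v in range(deg_u + 1, min(2 * deg_u, max_degree + 1)):
            r = (deg_v + deg_u) // 2         # frontier parameter
            half_gap = (deg_v - deg_u) // 2  # the parameter m of the proof
            # Contributing frontier gates have r < deg_w <= min(deg_v, 2r).
            for deg_w in range(r + 1, min(deg_v, 2 * r) + 1):
                # Degree of the derivative of v with respect to w.
                assert deg_v - deg_w <= half_gap
                # u lies below w_1, and children of frontier gates have
                # degree at most r.
                for deg_w1 in range(deg_u, r + 1):
                    # Degree of the derivative of w_1 with respect to u.
                    assert deg_w1 - deg_u <= half_gap
                    # Degree of f_{w_2}: bounded only by the full gap.
                    assert deg_w - deg_w1 <= deg_v - deg_u
                    checked += 1
    return checked


print(check_degree_bounds(40) > 0)  # True, and no assertion fails
```

The last assertion reflects why $f_{w_2}$ needs the extra expansion: its degree is bounded by $\deg(f_v) - \deg(f_u)$ rather than by half of it.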

References

The material of this lecture is based on the following sources:

Chapter 2 of [SY].
Chapter 5 of [R].

Last updated on Jun 9, 2024
