Lecture 5 - Universal Circuits & Computing First-Order Partial Derivatives
Universal Circuits
Just as Boolean complexity theory has the notion of universal Turing machines, algebraic complexity theory has the notion of universal circuits: circuits that can simulate any circuit of a given size. We now give a formal definition.
Definition 1 (Universal Circuits): A circuit
For any
In other words,
By the results above on efficient homogenization, it is enough to construct universal circuits for homogeneous circuits computing forms.
In order to prove the existence of efficient universal circuits, it helps to first impose a bit more structure on the circuits we are considering. We do this by restricting attention to circuits in normal-homogeneous form.
Definition 2 (Normal-Homogeneous Form): A homogeneous circuit
- All inputs are labeled by a variable
- All edges leaving an input gate are connected to sum gates
- Output gates are sum gates
- Non-input gates are alternating: that is, a product gate is connected to a sum gate, and a sum gate is connected to a product gate.
- The fan-in of each product gate is exactly . (We do not restrict the fan-in of sum gates.)
- The fan-out of each addition gate is at most .
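The structural conditions in this list can be checked mechanically. The following is a minimal sketch under stated assumptions: the dictionary encoding of a circuit, the function name `normal_form_violations`, and the gate names are all hypothetical, and the two numeric bounds are left as parameters (`product_fan_in`, `sum_fan_out`) rather than fixed. The sketch does not check homogeneity itself, only the conditions listed above.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption of this sketch):
# gates[name] = (kind, children), with kind in {"var", "const", "add", "mul"}.
Circuit = Dict[str, Tuple[str, List[str]]]


def normal_form_violations(gates: Circuit, outputs: List[str],
                           product_fan_in: int, sum_fan_out: int) -> List[str]:
    """Return a list of human-readable violations of the listed conditions."""
    # parents[g] lists every gate that uses g, once per edge.
    parents: Dict[str, List[str]] = {name: [] for name in gates}
    for name, (_, children) in gates.items():
        for child in children:
            parents[child].append(name)

    violations: List[str] = []
    for name, (kind, children) in gates.items():
        if kind == "const":
            violations.append(f"{name}: input is not labeled by a variable")
        elif kind == "var":
            if any(gates[p][0] != "add" for p in parents[name]):
                violations.append(f"{name}: input feeds a non-sum gate")
        elif kind == "mul":
            if len(children) != product_fan_in:
                violations.append(f"{name}: product gate has wrong fan-in")
            if any(gates[c][0] != "add" for c in children):
                violations.append(f"{name}: product gate has a non-sum child")
        elif kind == "add":
            if len(parents[name]) > sum_fan_out:
                violations.append(f"{name}: sum gate exceeds the fan-out bound")
            if any(gates[c][0] == "add" for c in children):
                violations.append(f"{name}: sum gate feeds another sum gate")
    for out in outputs:
        if gates[out][0] != "add":
            violations.append(f"{out}: output gate is not a sum gate")
    return violations


# Demo: the bounds 2 and 1 are illustrative values chosen for this example only.
good: Circuit = {
    "x": ("var", []), "y": ("var", []),
    "s1": ("add", ["x"]), "s2": ("add", ["y"]),
    "p": ("mul", ["s1", "s2"]),
    "out": ("add", ["p"]),
}
bad: Circuit = {
    "x": ("var", []), "y": ("var", []),
    "p": ("mul", ["x", "y"]),  # inputs feed a product gate directly
}
good_report = normal_form_violations(good, ["out"], product_fan_in=2, sum_fan_out=1)
bad_report = normal_form_violations(bad, ["p"], product_fan_in=2, sum_fan_out=1)
```

In the demo, `good` wraps each input in a sum gate before multiplying, so it passes every check, while `bad` multiplies the inputs directly and outputs a product gate, so several conditions fail at once.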
We will first show that any homogeneous circuit can be efficiently transformed into a circuit in normal-homogeneous form.
Lemma 1: For any homogeneous circuit
We are now ready to prove that we can efficiently construct universal circuits.
Theorem 3 (Universal Circuits): For any integers
Proof: We
Computing First-Order Partial Derivatives
We will now prove the following seminal result in algebraic complexity theory, due to Baur and Strassen. The same technique is known as backpropagation in machine learning.
Theorem 4 (Baur-Strassen): Let
The main idea is that the chain rule for differentiation can be implemented efficiently in algebraic circuits, since each gate in the circuit has fan-in at most
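To make the chain-rule idea concrete, here is a minimal sketch of reverse-mode differentiation on a small arithmetic circuit. It is an illustration under stated assumptions, not the construction used in the proof: the tuple encoding of gates, the function names `evaluate` and `gradient`, and the restriction to binary gates are choices made for this example, and the code evaluates the derivatives numerically at a point rather than building a new circuit. What it does show is that one backward sweep, doing a constant amount of work per gate, yields all first-order partial derivatives at once.

```python
from typing import Dict, List, Tuple

# Assumed encoding for this sketch: gates are listed in topological order, and
# each gate is ("var", name), ("const", value), ("add", i, j) or ("mul", i, j),
# where i and j are indices of earlier gates. The last gate is the output.
Gate = Tuple


def evaluate(gates: List[Gate], point: Dict[str, float]) -> List[float]:
    """Forward pass: compute the value of every gate at the given point."""
    vals: List[float] = []
    for g in gates:
        if g[0] == "var":
            vals.append(point[g[1]])
        elif g[0] == "const":
            vals.append(g[1])
        elif g[0] == "add":
            vals.append(vals[g[1]] + vals[g[2]])
        elif g[0] == "mul":
            vals.append(vals[g[1]] * vals[g[2]])
        else:
            raise ValueError(f"unknown gate kind: {g[0]}")
    return vals


def gradient(gates: List[Gate], point: Dict[str, float]):
    """Backward pass: return (output value, {variable: partial derivative})."""
    vals = evaluate(gates, point)
    # adj[k] accumulates d(output) / d(gate k).
    adj = [0] * len(gates)
    adj[-1] = 1
    # Visit gates in reverse topological order, so adj[k] is final when used.
    for k in range(len(gates) - 1, -1, -1):
        g = gates[k]
        if g[0] == "add":
            # d(u + v)/du = d(u + v)/dv = 1
            adj[g[1]] += adj[k]
            adj[g[2]] += adj[k]
        elif g[0] == "mul":
            # d(u * v)/du = v and d(u * v)/dv = u
            adj[g[1]] += adj[k] * vals[g[2]]
            adj[g[2]] += adj[k] * vals[g[1]]
    grad: Dict[str, float] = {}
    for k, g in enumerate(gates):
        if g[0] == "var":
            grad[g[1]] = grad.get(g[1], 0) + adj[k]
    return vals[-1], grad


# Example circuit for f(x, y) = x*y + x*x.
circuit = [("var", "x"), ("var", "y"), ("mul", 0, 1), ("mul", 0, 0), ("add", 2, 3)]
value, grad = gradient(circuit, {"x": 3, "y": 5})
# value is 24; grad is {"x": 11, "y": 3}, matching df/dx = y + 2x and df/dy = x.
```

The backward loop touches each gate once and performs at most two multiplications and two additions for it, which is where the bounded fan-in matters.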
Proof: We prove this theorem by induction on the size of the circuit computing
Base case: If
Inductive step: Suppose that the theorem holds for all circuits of size at most