12. Set Disjointness II
At the beginning of the last lecture, we introduced the claim that the randomized communication complexity of the Set Disjointness function is $R(\mathrm{DISJ}_n) = \Theta(n)$.
The optimal lower bound was first established by Kalyanasundaram and Schnitger (1992) and shortly afterwards an alternative proof was given by Razborov (1992). But we will cover a different proof of the same theorem that provides us with a great excuse to introduce a fundamental tool in communication complexity (and other areas of theoretical computer science): information theory.
Basics of Information Theory
The starting point for information theory is the notion of entropy.

Definition 1. The entropy of a random variable $X$ with distribution $p$ over a finite set $\mathcal{X}$ is
$$H(X) = \sum_{x \in \mathcal{X}} p(x) \log \frac{1}{p(x)}.$$
When $p(x) = 0$, we use the convention that $p(x) \log \frac{1}{p(x)} = 0$; all logarithms are in base 2.
The notion of entropy satisfies a number of useful properties. Perhaps the most fundamental is a direct consequence of Jensen’s inequality, which gives a universal upper bound on the entropy of any random variable.
Proposition 1. For any random variable $X$ with distribution over a finite set $\mathcal{X}$, the entropy of $X$ is bounded above by
$$H(X) \le \log |\mathcal{X}|.$$
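As a quick sanity check of this bound, here is a minimal Python sketch (the helper name `entropy` is ours): the uniform distribution over a set of size $8$ achieves $\log |\mathcal{X}| = 3$ exactly, while a biased distribution on the same set falls strictly below it.

```python
import math

def entropy(dist):
    """Shannon entropy (base 2) of a distribution given as a list of probabilities."""
    return sum(p * math.log2(1 / p) for p in dist if p > 0)

# The uniform distribution over 8 outcomes meets the bound log2(8) = 3 with equality.
uniform = [1/8] * 8
print(entropy(uniform))  # 3.0

# A biased distribution over the same 8-element set has strictly smaller entropy.
biased = [1/2, 1/4, 1/8, 1/16, 1/32, 1/32, 0, 0]
print(entropy(biased) < 3.0)  # True
```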
When $X$ and $Y$ are jointly distributed random variables, the conditional entropy of $X$ given $Y$ is
$$H(X \mid Y) = \mathop{\mathbb{E}}_{y \sim Y} \left[ H(X \mid Y = y) \right],$$
where we write $H(X \mid Y = y)$ to denote the entropy of the conditional distribution of $X$ given the event $Y = y$.
Proposition 2. For any jointly distributed random variables $X$ and $Y$,
$$H(XY) = H(Y) + H(X \mid Y).$$
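The chain rule for entropy, $H(XY) = H(Y) + H(X \mid Y)$, can be verified numerically on a small example. In the Python sketch below (the helper name `H` is ours), $Y$ is a fair coin, and $X$ is forced to $0$ when $Y = 0$ but is a fair coin when $Y = 1$.

```python
import math

def H(dist):
    """Entropy (base 2) of a distribution {outcome: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

# A correlated pair (X, Y): if Y = 0 then X = 0; if Y = 1 then X is a fair coin.
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 1): 0.25}

h_joint = H(joint)
h_y = H({0: 0.5, 1: 0.5})

# H(X | Y) = E_y[H(X | Y = y)]: 0 bits when Y = 0, 1 bit when Y = 1.
h_x_given_y = 0.5 * H({0: 1.0}) + 0.5 * H({0: 0.5, 1: 0.5})

print(h_joint, h_y + h_x_given_y)  # 1.5 1.5
```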
The mutual information between jointly distributed random variables $X$ and $Y$ is
$$I(X ; Y) = H(X) - H(X \mid Y).$$
(There are many other equivalent formulations of mutual information as well, such as $I(X ; Y) = H(Y) - H(Y \mid X) = H(X) + H(Y) - H(XY)$.)
Informally, mutual information measures “how much information” about $X$ is revealed by the value of $Y$.
We can also define conditional mutual information similarly: for jointly distributed random variables $X$, $Y$, and $Z$,
$$I(X ; Y \mid Z) = H(X \mid Z) - H(X \mid YZ).$$
Mutual information satisfies a number of useful properties. One particularly useful property is the chain rule.
Proposition 3. When $X$, $Y$, and $Z$ are jointly distributed random variables,
$$I(XY ; Z) = I(X ; Z) + I(Y ; Z \mid X).$$
Another useful property is that conditioning on an independent random variable cannot decrease mutual information.
Proposition 4. When $X$, $Y$, and $Z$ are jointly distributed random variables and $X$ and $Z$ are independent under this joint distribution, then
$$I(X ; Y \mid Z) \ge I(X ; Y).$$
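Both of these properties can be checked numerically by computing entropies of marginals directly from a joint probability table. The Python sketch below (the helper names `H`, `marginal`, `I`, and `I_cond` are ours) uses the distribution where $X$ and $Y$ are independent fair coins and $Z = X \oplus Y$; since $X$ and $Z$ are independent here, it also shows that conditioning on an independent variable can strictly increase mutual information.

```python
import math
from itertools import product

def H(dist):
    """Entropy (base 2) of a distribution {outcome: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

def marginal(joint, idx):
    """Marginalize a joint distribution {tuple: prob} onto the coordinates in idx."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def I(joint, a, b):
    """Mutual information I(A;B) = H(A) + H(B) - H(AB) over coordinate sets a, b."""
    return H(marginal(joint, a)) + H(marginal(joint, b)) - H(marginal(joint, a + b))

def I_cond(joint, a, b, c):
    """Conditional mutual information I(A;B|C) = H(AC) + H(BC) - H(ABC) - H(C)."""
    return (H(marginal(joint, a + c)) + H(marginal(joint, b + c))
            - H(marginal(joint, a + b + c)) - H(marginal(joint, c)))

# Joint distribution of (X, Y, Z) where X, Y are fair coins and Z = X XOR Y.
joint = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}

# Chain rule (Proposition 3): I(XY; Z) = I(X; Z) + I(Y; Z | X).
chain_lhs = I(joint, (0, 1), (2,))
chain_rhs = I(joint, (0,), (2,)) + I_cond(joint, (1,), (2,), (0,))
print(chain_lhs, chain_rhs)  # 1.0 1.0

# X and Z are independent, yet conditioning on Z reveals everything about Y:
# I(X; Y) = 0 while I(X; Y | Z) = 1.
print(I(joint, (0,), (1,)), I_cond(joint, (0,), (1,), (2,)))  # 0.0 1.0
```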
Intuition for a Set Disjointness Lower Bound
We can use the information theory results introduced above to obtain a linear lower bound on the randomized communication complexity of Set Disjointness.
The starting point of this lower bound requires us to go back to a very basic question: why do we believe that the randomized communication complexity of the set disjointness function should be linear in $n$ in the first place?
There is one very natural answer to this question: $\mathrm{DISJ}_n$ looks like $n$ independent copies of (the negation of) the single-bit AND function, so the communication complexity of disjointness should be bounded below by $n$ times the communication complexity of AND:
$$R(\mathrm{DISJ}_n) \ge n \cdot R(\mathrm{AND}).$$
If such an inequality can be established, then we are done because even the trivial lower bound $R(\mathrm{AND}) \ge 1$ then gives $R(\mathrm{DISJ}_n) \ge n$.
The problem with the intuition described above is that inequalities like the conjectured one do not hold in general.
Indeed, this intuition implicitly assumes that the best way to compute a function which can be represented as multiple instances of a smaller function must be to compute each of those instances separately to obtain the final result.
This is certainly not always the best course of action!
In fact, we can see how wrong this implicit assumption can be by considering a slight variant of the disjointness function where we replace the AND function applied to each coordinate with the XOR function. The resulting function is simply the Equality function, which also looks like $n$ independent copies of a single-bit function and yet has constant randomized communication complexity with public randomness.
At this point, it would be natural to assume we need to go back to the very beginning and find a different line of attack to try to prove a lower bound on disjointness.
But remarkably, using information theory we don’t need to do that: we can show that our original intuition on the disjointness function was in fact generally correct if we simply relax our target inequality slightly, replacing the communication complexity of the AND function with its information cost.
Let us see how we obtain such a result.
From Set Disjointness to AND
Fix a randomized protocol $\pi$ that computes $\mathrm{DISJ}_n$ with error at most $1/3$. The distribution $\mu$ over inputs $(x, y)$, along with an auxiliary random variable $D$, is generated as follows:
- Draw $D \in \{A, B\}^n$ uniformly at random.
- Draw $r \in \{0, 1\}^n$ uniformly at random.
- For each $i \in [n]$, set $(x_i, y_i) = (r_i, 0)$ if $D_i = A$ and set $(x_i, y_i) = (0, r_i)$ if $D_i = B$.
- Set $\Pi$ to be the transcript of $\pi$ run on $(x, y)$.
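One standard way to implement a sampling procedure of this type is sketched below in Python (the name `sample_mu` and the use of the labels `A`/`B` for the coordinate owners are ours): each coordinate is assigned to one player, whose bit is drawn uniformly, while the other player's bit is forced to $0$, so every sampled pair of inputs is disjoint by construction.

```python
import random

def sample_mu(n, rng):
    """Sample (x, y, d): d assigns each coordinate to one side (A or B), the
    assigned side's bit is uniform, and the other side's bit is forced to 0."""
    x, y, d = [0] * n, [0] * n, []
    for i in range(n):
        owner = rng.choice("AB")   # which player's coordinate is free
        bit = rng.randint(0, 1)
        if owner == "A":
            x[i] = bit             # y[i] stays 0
        else:
            y[i] = bit             # x[i] stays 0
        d.append(owner)
    return x, y, d

rng = random.Random(0)
for _ in range(1000):
    x, y, _ = sample_mu(16, rng)
    assert all(xi & yi == 0 for xi, yi in zip(x, y))  # inputs are always disjoint
print("all sampled pairs are disjoint")
```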
Let $X$, $Y$, $D$, and $\Pi$ denote the random variables generated by this procedure. Note that every pair $(x, y)$ in the support of this distribution is disjoint, and that the pairs $X_1 Y_1, \ldots, X_n Y_n$ are mutually independent given $D$.
We can use the information theory tools developed earlier to establish the following result.
Lemma 4. With $X$, $Y$, $D$, and $\Pi$ defined as above and letting $\mathrm{IC}(\mathrm{AND})$ denote the minimum of $I(AB ; \Pi' \mid D')$ over all randomized protocols $\pi'$ that compute the AND function with error at most $1/3$, where $(A, B, D')$ is drawn from the $n = 1$ version of the distribution above and $\Pi'$ is the transcript of $\pi'$ on $(A, B)$, we have
$$R(\mathrm{DISJ}_n) \ge \sum_{i=1}^{n} I(X_i Y_i ; \Pi \mid D) \ge n \cdot \mathrm{IC}(\mathrm{AND}).$$
Proof. Let $\pi$ represent a randomized communication protocol that computes $\mathrm{DISJ}_n$ with minimal worst-case cost. Then
$$R(\mathrm{DISJ}_n) = \mathrm{cost}(\pi) \ge H(\Pi) \ge I(XY ; \Pi \mid D) \ge \sum_{i=1}^{n} I(X_i Y_i ; \Pi \mid D),$$
where the last inequality follows from the chain rule (Proposition 3) together with the fact that the pairs $X_1 Y_1, \ldots, X_n Y_n$ are mutually independent given $D$ (Proposition 4).
For any $i \in [n]$, consider now the protocol $\pi_i$ for the AND function that proceeds as follows. On input $(a, b)$, it generates $(X, Y, D)$ as in the procedure for $\mu$ and then overwrites $X_i = a$ and $Y_i = b$. Then Alice and Bob simulate the protocol $\pi$ on the resulting input $(X, Y)$. Note that for any $j \ne i$, we will have $X_j \wedge Y_j = 0$ by construction, and so the outcome of $\pi$ determines the value of $\mathrm{AND}(a, b)$. Furthermore, if $(a, b)$ is drawn from the $n = 1$ version of the distribution $\mu$ and run with the protocol $\pi_i$, then we have $I(AB ; \Pi_i \mid D') = I(X_i Y_i ; \Pi \mid D)$, and by definition in the lemma statement this value is bounded below by $\mathrm{IC}(\mathrm{AND})$. Summing over all $i \in [n]$ completes the proof.
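The embedding step in this proof can be sanity-checked in code. In the Python sketch below, `disj` is a stand-in for the output of a correct protocol $\pi$, and the hypothetical `and_via_disj` overwrites coordinate $i$ exactly as in the proof; since every other coordinate is disjoint by construction, the disjointness answer determines $\mathrm{AND}(a, b)$ on every input pair.

```python
import random

def disj(x, y):
    """Stand-in for the output of a correct DISJ protocol on (x, y)."""
    return int(all(xi & yi == 0 for xi, yi in zip(x, y)))

def and_via_disj(a, b, i, n, rng):
    """Embed (a, b) at coordinate i of an input pair drawn as in the distribution,
    then read AND(a, b) off the disjointness answer."""
    x, y = [0] * n, [0] * n
    for j in range(n):
        if rng.choice("AB") == "A":
            x[j] = rng.randint(0, 1)   # y[j] stays 0
        else:
            y[j] = rng.randint(0, 1)   # x[j] stays 0
    x[i], y[i] = a, b                  # overwrite the embedded coordinate
    return 1 - disj(x, y)              # every j != i is disjoint, so DISJ flips AND

rng = random.Random(1)
for a in (0, 1):
    for b in (0, 1):
        assert and_via_disj(a, b, i=3, n=8, rng=rng) == (a & b)
```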
Note that we are not quite done at this point. Earlier, we obtained the trivial lower bound of $R(\mathrm{AND}) \ge 1$ essentially for free, but the analogous bound for information cost, $\mathrm{IC}(\mathrm{AND}) = \Omega(1)$, requires a genuine proof.
We will see this lower bound in the next lecture.