9. Communication Complexity
Let us now move on from query complexity to consider a different model of computation in which randomness plays an important role: communication complexity. In the next few lectures, we broadly follow the presentation of Rao and Yehudayoff (2020)—see the book for the details and much more about communication complexity.
The Model
Communication complexity is concerned with the minimum amount of communication that must be exchanged between various parties to compute a given function on their joint inputs.
We focus on the two-party model of communication.
In this model, there are two parties that each obtain part of the input.
By convention, the two parties are named Alice and Bob, and we often use
A deterministic communication protocol that Alice and Bob use to compute the value
-
It can be viewed as an interactive procedure. At every step of the procedure, Alice sends a single bit to Bob or Bob sends a single bit to Alice. The procedure specifies which of these two options occurs at each step, and also what inputs cause the sender to communicate a 1 to the other player. The procedure terminates when no more messages need to be exchanged between the players and they both know the output value of the protocol. The list of bits that are transmitted during the execution of the protocol is called the transcript.
-
A communication protocol can also be viewed as a binary tree. Each internal node of the tree specifies which party sends the next bit and the set of inputs for which that party outputs 1. The leaves of the tree are labelled with the output value of the protocol.
-
Finally, a communication protocol can be viewed as a partition of the matrix with rows and columns indexed by the set of strings in
into combinatorial rectangles. Recall that a combinatorial rectangle (or just rectangle from now on) in the matrix index by is a subset of the matrix for some . Each rectangle corresponds to the set of inptus that reach a given leaf of the tree representation of the protocol.
We will often use
A randomized communication protocol with public randomness is a deterministic protocol defined over Alice and Bob’s input and a random string
Throughout this and the next few lectures, we use
Upper Bounds
Let’s familiarize ourselves with the communication complexity model by working through a few examples.
Equality
We have already seen one example in the very first lecture: the equality function
Theorem.
.
We have seen that a valid protocol for this function involves Alice and Bob exchanging the inner products
Another way to view the same protocol is that Alice and Bob treat the random string
Greater-Than
A natural variant of the equality function is the greater-than function
The greater-than function also has low communication complexity.
Theorem.
.
Proof. We can obtain a protocol for
by using the protocol for equality as a subroutine. As a first step, we can use the protocol to determine if ; if so, the output value is . But more interestingly, we can then have Alice and Bob run the protocol for equality on their first bits each. If the strings and are equal, then we can ignore them and continue recursively on the last bits of both inputs. Otherwise, the last bits of each inputs don’t matter; we can continue recursively on the first half of Alice and Bob’s inputs. Continuing in this way, if each call to equality returns the correct answer, we reach the smallest index for which ; by exchaning this bit, Alice and Bob obtain the value of . The one catch about the argument above is the assumption that every call to the procedure that computes equality returns the correct value. In general, this is not true: every call errs with constant probability. However, with success amplification we can can drive down the error probability to less than
, say, and apply the union bound to argue that except with probability at most all calls to the equality subroutine return the correct value. The recursion calls equality subroutines at
levels, and the success amplification can be accomplished with an additional multiplicative factor overhead, giving the bound claimed in the theorem.
We can do even better than the argument above:
k-Disjointness
Another fundamental function in the study of communication complexity is the disjointness function, in which we interpret
The
By making full use of the idea of treating the random string as a possible mask on the inputs, Johan Håstad and Avi Wigderson (2007) obtained a surprisingly efficient randomized communication protocol for the
Theorem.
Proof. Let us denote Alice and Bob’s inputs as the sets
. Partition the random string into a sequence of strings of length each. Alice can find the smallest index for which the set satisfies . Bob updates his set to be . Then Bob sends the smallest index for which , and Alice updates her set to be . Alice and Bob then repeat this process iteratively, continuing to shrink their sets in each round. If either Alice or Bob ends up with an empty set, the protocol ends and outputs 1. Otherwise, if rounds have been completed for a large enough constant and both players still have a non-empty set, the protocol ends with the value 0. Let’s examine the correctness of the protocol first. When
, then there is some index that is never eliminated, so the protocol always correctly outputs 0. Conversely, when , we can use standard Chernoff-Hoeffding bounds to show that with high probability, a constant fraction of the remaining elements in each set is removed in each round. Therefore, with high probability (that depends on the choice of the constant ), the algorithm correctly outputs 1. It remains to bound the cost of this protocol. Note that the probability that a given random string
satisfies the condition that is exactly : each of the coordinates in is included in with probability and all those events are independent. So the probability that none of the first random strings satisfy the first condition is at most . This means that with high probability Alice’s first message requires at most bits. And with a naive argument, we could show that the total communication between Alice and Bob is therefore
, since there are rounds in the procedure. But we can eliminate the logarithmic factor by noting that the sets get reduced by a constant factor with high probability at each round. This means that with a more careful analysis, we see that the total communication of the protocol forms a geometric series and is thus bounded by .
The bound is tight, as we will see in later lectures by showing that the general disjointness function has communication complexity
Inner Product
One more example: the inner product function
Can you find a randomized communication protocol that computes this function with
We’ll return to this question during the next lecture.