2. Definitions
Our goal throughout the rest of these lectures will be to better understand when (and perhaps why) randomness helps with computational efficiency in various models of computation like the ones above. To do so, we need to introduce some definitions.
But an important challenge in the study of randomized algorithms is that even when we fix a model of computation, there are many different ways that we can define randomized algorithms, and not all of these definitions are equivalent to each other. So with this warning in place, here are some of the most basic definitions that we will need:
Randomized Algorithms
Even at the very start of our exploration, we already face a non-trivial challenge: how exactly do we define what a randomized algorithm is? There are at least five reasonable definitions:
A randomized algorithm is…
- (Online randomness I) An algorithm that can flip a fair coin at any step of its computation.
- (Online randomness II) An algorithm that can flip a $p$-biased coin at any step of its computation, for any choice of $p \in [0,1]$.
- (Offline randomness) An algorithm that has access to a random string $r \in \{0,1\}^m$ where each bit $r_i$ is generated uniformly at random upon initialization of the algorithm.
- (Distribution I) A distribution over deterministic algorithms.
- (Distribution II) A convex combination of deterministic algorithms.
All of these definitions are essentially equivalent: in the settings we will consider, an algorithm of any one of these types can be converted into an equivalent algorithm of each of the other four types. This means that we are free to switch between the different definitions of randomized algorithms and use whichever one is most convenient at the moment.
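To make one of these conversions concrete, here is a small Python sketch (the function names and the toy algorithm are made up for illustration) of how an algorithm with online randomness can be converted into one with offline randomness: if the algorithm makes at most $m$ coin flips, we can generate the random string $r \in \{0,1\}^m$ up front and read off its bits one at a time.

```python
import random

def online_algorithm(x, flip):
    """Toy algorithm: counts the 1-bits of x, but only while `flip` keeps
    returning heads.  `flip` is any zero-argument function that produces
    one random bit per call."""
    count = 0
    for bit in x:
        if flip() == 0:      # tails: halt early
            return count
        count += bit
    return count

def run_online(x, rng):
    # Online randomness: flip a fresh fair coin at each step.
    return online_algorithm(x, lambda: rng.randrange(2))

def run_offline(x, rng):
    # Offline randomness: generate the whole random string r up front
    # (m = len(x) flips always suffice), then read its bits in order.
    r = [rng.randrange(2) for _ in range(len(x))]
    bits = iter(r)
    return online_algorithm(x, lambda: next(bits))

# Both versions consume the same stream of coin flips in the same order,
# so with identically seeded generators they produce identical outputs.
x = [1, 0, 1, 1, 0, 1]
print(run_online(x, random.Random(7)) == run_offline(x, random.Random(7)))  # prints True
```

Fixing the string $r$ to each of its $2^m$ possible values also recovers the distribution-over-deterministic-algorithms view (Definition 4).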
Roughly speaking, when we try to design an explicit algorithm for a task, we will often use Definition 2 or 3. And when we want to prove lower bounds or general statements about randomized algorithms, Definition 4 will often be the most convenient to work with, so that is what we will use.
Cost
The cost of the algorithms we consider in this course will depend on the computational model.
For a deterministic algorithm $A$ and input $x$, write $\mathrm{cost}(A, x)$ for the cost incurred when $A$ is run on $x$; depending on the model, this may be a number of queries, a running time, a number of bits of communication, and so on.
Given a cost measure defined by the model in this way, and viewing a randomized algorithm $R$ as a distribution over deterministic algorithms, there are three natural ways to define the cost of $R$.
Worst-case cost.
The worst-case cost of the randomized algorithm $R$ is $\mathrm{cost}(R) = \max_x \max_{A \in \mathrm{supp}(R)} \mathrm{cost}(A, x)$, the maximum cost over all inputs and all outcomes of the algorithm's internal randomness.
Average cost.
The average cost of $R$ is $\overline{\mathrm{cost}}(R) = \max_x \mathbb{E}_{A \sim R}[\mathrm{cost}(A, x)]$, the maximum over all inputs of the expected cost over the algorithm's internal randomness.
Distributional cost.
If we have a distribution $\mu$ over the inputs, the distributional cost of $R$ with respect to $\mu$ is $\mathrm{cost}_\mu(R) = \mathbb{E}_{x \sim \mu}\, \mathbb{E}_{A \sim R}[\mathrm{cost}(A, x)]$, where the expectation is over both a random input drawn from $\mu$ and the algorithm's internal randomness.
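As a sanity check on these three measures, here is a small Python sketch with a made-up cost table for a randomized algorithm that runs one of two deterministic algorithms, each with probability $1/2$:

```python
# Toy setup: two inputs, and a randomized algorithm R that runs the
# deterministic algorithm A0 with probability 1/2 and A1 with probability
# 1/2.  The cost table is invented purely for illustration.
cost = {
    ("A0", "x0"): 1, ("A0", "x1"): 4,
    ("A1", "x0"): 3, ("A1", "x1"): 2,
}
R = [(0.5, "A0"), (0.5, "A1")]    # distribution over deterministic algorithms
inputs = ["x0", "x1"]

def expected_cost(x):
    # E_{A ~ R}[cost(A, x)]
    return sum(p * cost[(A, x)] for p, A in R)

# Worst-case cost: max over inputs AND over the randomness.
worst_case = max(cost[(A, x)] for _, A in R for x in inputs)
# Average cost: max over inputs of the expectation over the randomness.
average = max(expected_cost(x) for x in inputs)
# Distributional cost: expectation over a random input x ~ mu as well.
mu = {"x0": 0.25, "x1": 0.75}
distributional = sum(mu[x] * expected_cost(x) for x in inputs)

print(distributional, average, worst_case)   # 2.75 3.0 4
```

Note that $2.75 \le 3.0 \le 4$ here, in line with the proposition that follows.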
The different cost measures relate to each other in the following way.
Proposition. For every randomized algorithm $R$ and every distribution $\mu$ over its inputs, $\mathrm{cost}_\mu(R) \le \overline{\mathrm{cost}}(R) \le \mathrm{cost}(R)$, and the second inequality can be strict.
Proof. For every distribution $\mu$, $\mathbb{E}_{x \sim \mu}\,\mathbb{E}_{A \sim R}[\mathrm{cost}(A, x)] \le \max_x \mathbb{E}_{A \sim R}[\mathrm{cost}(A, x)]$, and we have equality when $\mu$ is the point mass distribution on the input that maximizes $\mathbb{E}_{A \sim R}[\mathrm{cost}(A, x)]$. The second inequality follows directly from the definitions, since an expectation never exceeds a maximum over the same set. To see that the gap between the average and worst-case costs of an algorithm can be arbitrarily large, consider the algorithm that flips a fair coin at each step, does one unit of work if it lands heads, and halts if it lands tails. This algorithm has average cost 1 (its cost is the number of heads seen before the first tails, which has expectation 1) but unbounded worst-case cost.
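The expected cost of this coin-flipping algorithm is a geometric series: the cost equals the number of heads before the first tails, so $\Pr[\mathrm{cost} = k] = 2^{-(k+1)}$ and the expected cost is $\sum_{k \ge 0} k \cdot 2^{-(k+1)} = 1$. A quick numerical check in Python:

```python
# Pr[cost = k] = 2^(-(k+1)): the algorithm sees k heads and then one tails.
# Truncating the series sum_{k >= 0} k * 2^(-(k+1)) at k = 200:
expected = sum(k * 2 ** -(k + 1) for k in range(200))
print(expected)   # ≈ 1.0
```

Meanwhile every value of $k$ has positive probability, so the worst-case cost is unbounded.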
Success / Error
Randomized algorithms are particularly interesting to study when we allow them to err with some probability. To do so, we need to define what we mean by the success or error rate of an algorithm.
Aside. There are also other ways to measure the correctness of an algorithm that go beyond measuring error probabilities as we do below. We will see these alternative correctness measures in later lectures.
In all of the following definitions, we assume that the target function that the randomized algorithm is attempting to compute is a Boolean function $f \colon \{0,1\}^n \to \{0,1\}$.
For a deterministic algorithm $A$ and input $x$, we say that $A$ errs on $x$ if $A(x) \neq f(x)$.
The error probability of the randomized algorithm $R$ on input $x$ is $\mathrm{err}(R, x) = \Pr_{A \sim R}[A(x) \neq f(x)]$.
We can now define two notions of overall error of $R$.
Worst-case error. The worst-case error of $R$ is $\mathrm{err}(R) = \max_x \mathrm{err}(R, x) = \max_x \Pr_{A \sim R}[A(x) \neq f(x)]$.
Distributional error. When $\mu$ is a distribution over the inputs, the distributional error of $R$ with respect to $\mu$ is $\mathrm{err}_\mu(R) = \Pr_{x \sim \mu,\, A \sim R}[A(x) \neq f(x)] = \mathbb{E}_{x \sim \mu}[\mathrm{err}(R, x)]$.
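Here is a small Python sketch of the two error measures, with a made-up Boolean function $f$ and a randomized algorithm $R$ that mixes two deterministic algorithms given by explicit output tables:

```python
# Toy example: f is a Boolean function on three inputs, and R runs one of
# two deterministic algorithms (each with probability 1/2) whose output
# tables are invented purely for illustration.
inputs = ["x0", "x1", "x2"]
f = {"x0": 0, "x1": 1, "x2": 1}
out = {
    "A0": {"x0": 0, "x1": 1, "x2": 0},   # errs on x2
    "A1": {"x0": 0, "x1": 0, "x2": 1},   # errs on x1
}
R = [(0.5, "A0"), (0.5, "A1")]

def err(x):
    # err(R, x) = Pr_{A ~ R}[A(x) != f(x)]
    return sum(p for p, A in R if out[A][x] != f[x])

worst_case_err = max(err(x) for x in inputs)              # max over inputs
mu = {"x0": 0.2, "x1": 0.4, "x2": 0.4}                    # an input distribution
distributional_err = sum(mu[x] * err(x) for x in inputs)  # over x ~ mu and A ~ R

print(worst_case_err, distributional_err)   # 0.5 0.4
```

Taking $\mu$ to be the point mass on $x1$ (or on $x2$) recovers the worst-case error $0.5$, in line with the proposition that follows.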
As with measures of cost, there is a close relation between the two different measures of error.
Proposition. For every randomized algorithm $R$ and function $f$, $\max_\mu \mathrm{err}_\mu(R) = \mathrm{err}(R)$, where the maximum is taken over all distributions $\mu$ on $\{0,1\}^n$.
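A short proof sketch of this proposition, using the notation above (the argument mirrors the one for cost):

```latex
\mathrm{err}_\mu(R)
  = \mathop{\mathbb{E}}_{x \sim \mu}\bigl[\mathrm{err}(R, x)\bigr]
  \le \max_x \mathrm{err}(R, x)
  = \mathrm{err}(R),
```

with equality when $\mu$ is the point mass on an input $x^\ast$ that maximizes $\mathrm{err}(R, x)$.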
Complexity Measures
The cost and error measures above apply to individual randomized algorithms. But most often we will be interested not in individual algorithms, but in the inherent complexity of the functions they aim to compute.
Informally, the randomized complexity of a function $f$ is the minimum cost of any randomized algorithm that computes $f$ with small error. To make this formal, fix an error parameter $0 \le \epsilon < \frac{1}{2}$.
Worst-case complexity. The worst-case complexity of $f$ is $R_\epsilon(f) = \min\{\mathrm{cost}(R) : \mathrm{err}(R) \le \epsilon\}$, the minimum worst-case cost of any randomized algorithm that computes $f$ with worst-case error at most $\epsilon$.
Average-case complexity. The average-case complexity of $f$ is $\overline{R}_\epsilon(f) = \min\{\overline{\mathrm{cost}}(R) : \mathrm{err}(R) \le \epsilon\}$, the minimum average cost of any randomized algorithm that computes $f$ with worst-case error at most $\epsilon$.
These are not the only two complexity measures we can study. For example, we can also study the following one:
Maximum distributional complexity. Define the distributional complexity of $f$ with respect to the input distribution $\mu$ as $D^\mu_\epsilon(f) = \min\{\mathrm{cost}_\mu(R) : \mathrm{err}_\mu(R) \le \epsilon\}$. The maximum distributional complexity of $f$ is $\max_\mu D^\mu_\epsilon(f)$, where the maximum is taken over all distributions $\mu$ on the inputs.
These three complexity measures are comparable to each other.
Proposition. For every function $f$ and every $0 \le \epsilon < \frac{1}{2}$, $\max_\mu D^\mu_\epsilon(f) \le \overline{R}_\epsilon(f) \le R_\epsilon(f)$.
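A sketch of why the chain of inequalities $\max_\mu D^\mu_\epsilon(f) \le \overline{R}_\epsilon(f) \le R_\epsilon(f)$ holds, using the notation above:

```latex
% Second inequality: for every randomized algorithm R,
\overline{\mathrm{cost}}(R) \le \mathrm{cost}(R),
% so minimizing both sides over all R with err(R) <= eps gives
% \overline{R}_\epsilon(f) \le R_\epsilon(f).
%
% First inequality: fix R^* with err(R^*) <= eps achieving
% \overline{R}_\epsilon(f). For every distribution mu,
\mathrm{err}_\mu(R^\ast) \le \max_x \mathrm{err}(R^\ast, x) \le \epsilon
\quad\text{and}\quad
\mathrm{cost}_\mu(R^\ast)
  \le \max_x \mathop{\mathbb{E}}_{A \sim R^\ast}[\mathrm{cost}(A, x)]
  = \overline{\mathrm{cost}}(R^\ast),
% so R^* witnesses D^mu_eps(f) <= \overline{R}_\epsilon(f) for every mu.
```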
It’s a good exercise to determine when each of these inequalities can be strict, and how large the gaps between the different measures can be.