The Random Oracle Model -- What Does It Mean?

The following is a typical example of the kind of thing that can be proven in the random oracle model:

"Full domain hash is secure against an adaptive chosen message attack with respect to existential forgeries in the random oracle model, provided that it infeasible to invert a trapdoor one-way permutation, f."

The proof is actually a reduction: given a hypothetical forging algorithm for full domain hash, which has some specified probability of success, we construct an inverting algorithm, which inverts the function f on randomly chosen points, again with some specified probability of success. If we believe that it is infeasible to invert f, then we have a contradiction, and we conclude that the hypothesized forging algorithm cannot exist.

So far, this is all pretty standard, and not too difficult to follow. However, the proof is taking place in the "random oracle model" which means that some additional assumptions are being made, and the proof of security is valid only if these assumptions are valid.

The random oracle model is usually described by saying that the hash function (which is used in full domain hash in our example) is modelled as a random function. Of course, when full domain hash is actually used in practice, a particular hash function must be specified, and so the assumption used in the random oracle model is not valid.

Indeed, there is a particular protocol which has been proved secure in the random oracle model, but which becomes insecure whenever the hash function used in the protocol is specified (this is a result of Canetti, Goldreich and Halevi). As a result, the random oracle model is somewhat controversial. Shoup has gone so far as to characterize proofs in the random oracle model as "heuristic proofs". (It is not clear what the phrase "heuristic proof" means, but it is undoubtedly not good.)

The correct way to interpret a proof of security for a protocol P in the random oracle model is to view it as a proof of security against certain types of attacks on the protocol P. More precisely, the proof shows that the protocol P is secure against what might be termed "hash-generic" attacks. This means that any attack which treats the hash function as a random function will not be successful (regardless of whether the hash function is actually a rndom function). In other words, it is better to think of a proof in the random oracle model as a proof in which we make an assumption about the attacking algorithm rather than an assumption about the hash function (which, as we already mentioned, cannot be valid).

I suggest the term "hash-generic" attack to suggest an analogy with algorithms in the generic group model. Suppose we consider the discrete logarithm problem as a typical example. An algorithm to solve the DLP in a generic group is allowed to access group elements only via a random encoding algorithm, which encodes elements of the cyclic group (Z_n,+) injectively as random bit-strings, say. A "group-generic" algorithm is not allowed to make any assumptions about encodings of group elements other than those elements that it has queried through a group oracle. This is analogous to a "hash-generic" algorithm that is only allowed to obtain values of a hash function via a random hash oracle; the algorithm is not permitted to make any assumptions about values of the hash function at any points other than the ones at which it queried the oracle.

It was proven by Shoup that the DLP in (Z_n,+) cannot be solved by a group-generic algorithm in time less than the square root of n. This does not prove anything about the difficulty of solving the DLP in any specific group by a non-generic algorithm. For example, the index calculus algorithm can solve the DLP in (Z_p,*) faster than a generic algorithm can. In the case of appropriate elliptic curve groups, the only known algorithms for the DLP are generic algorithms. Shoup's result on the complexity of generic algorithms tells us that any improvement to the complexity of solving the DLP in elliptic curves must come from non-generic algorithms. The complicated nature of the encoding used in elliptic curves (with respect to the group operation) can be taken as evidence (not a proof, of course) that the DLP in elliptic curves is hard.

A security proof in the random oracle model must be interpreted in a similar light. As mentioned above, a security proof establishes that a protocol is secure against hash-generic algorithms. It is of course possible that an algorithm can break the protocol for some particular hash functions (or even for all possible hash functions) by somehow taking advantage of how the hash function is computed. The proof in the random oracle model can be taken as evidence of security when the random oracle is replaced by a particular hash function. This was the "thesis" proposed by Bellare and Rogaway when they introduced the random oracle model, and it should be stressed that no practical protocol proven secure in the random oracle model has been broken when used with a "good" hash function, such as SHA-1 (the protocol used in the Canetti-Goldreich-Halevi result was not a natural protocol for a "reasonable" cryptographic application; rather, it was designed explicitly for the purposes of their proof).