5. Undecidability
We saw in the last lecture that the language \(A_{TM} = \{ \left< M \right>x : M \mbox{ accepts } x\}\) is undecidable. This is only one example of a fundamental limitation of Turing machines:
Turing machines cannot predict what other Turing machines will do without simulating them.
This is a theme that will reappear again later in the complexity part of the course. But for now, let’s see exactly what is meant by this statement in the context of decidability.
More Undecidable Languages
The proof of the undecidability of \(A_{TM}\) can easily be modified to show that the halting problem is undecidable.
Theorem 1. The language
\[\textsf{Halt}_{TM} = \{ \left< M \right>x : M \mbox{ halts on input } x \}\]is undecidable.
Proof. Assume on the contrary that \(\textsf{Halt}_{TM}\) is decidable by the Turing machine \(T\). Consider the machine \(M\) that on input \(x\) does the following:
- Obtain its own encoding \(\left< M\right>\) using the Recursion Theorem.
- Run \(T\) on input \(\left< M \right> x\).
- If \(T\) accepts, run forever in an infinite loop; otherwise, halt and accept.
By construction, \(M\) halts on \(x\) if and only if \(T\) does not accept \(\left< M \right> x\). But this contradicts the claim that \(T\) decides \(\textsf{Halt}_{TM}\).
We can also modify the proof to show that determining if a Turing machine accepts any string is also undecidable. Define \(L(M) = \{ x \in \{0,1\}^* : M \mbox{ accepts } x\}\); recall that \(L(M)\) denotes the language recognized by \(M\).
Theorem 2. The language
\[\textsf{Empty}_{TM} = \{ \left< M \right> : L(M) = \emptyset \}\]is undecidable.
Proof. Assume on the contrary that \(\textsf{Empty}_{TM}\) is decidable by the Turing machine \(T\). Consider the machine \(M\) that on input \(x\) does the following:
- Obtain its own encoding \(\left< M\right>\) using the Recursion Theorem.
- Run \(T\) on input \(\left< M \right>\).
- Accept if \(T\) accepts; otherwise reject.
By this construction, \(L(M) = \{0,1\}^*\) when \(T\) accepts \(\left< M \right>\) and \(L(M) = \emptyset\) when \(T\) rejects. This contradicts the claim that \(T\) decides \(\textsf{Empty}_{TM}\).
We can extend this theorem to show that it is impossible to decide any non-trivial property of languages of Turing machines.
Rice’s Theorem. Let $P$ be a subset of all languages such that
- There exists a Turing machine \(M_1\) for which \(L(M_1) \in P\), and
- There exists a Turing machine \(M_2\) for which \(L(M_2) \notin P\).
Then the language
\[\textsf{L}_{P} = \{ \left< M \right> : L(M) \in P \}\]is undecidable.
Proof. Assume on the contrary that \(L_P\) is decidable by the Turing machine \(T\).
Consider the machine \(M\) that on input \(x\) does the following:
- Obtain its own encoding \(\left< M\right>\) using the Recursion Theorem.
- Run \(T\) on input \(\left< M \right>\).
- If \(T\) accepts, simulate \(M_2\) on \(x\); otherwise simulate \(M_1\) on \(x\).
By this construction, \(L(M) = L(M_2) \notin P\) when \(T\) accepts \(\left< M \right>\) and \(L(M) = L(M_1) \in P\) when \(T\) rejects. This contradicts the claim that \(T\) decides \(L_P\).
Rice’s Theorem immediately implies Theorem 2 because \(P = \{\emptyset\}\) is a non-trivial property of languages of Turing machines. It also implies that we can’t decide if the language recognized by a Turing machine includes the empty string, is finite or infinite, contains an even number of strings of length at most 100, is a regular language, etc. But note that it does not imply that we can’t decide properties of Turing machines themselves; given a description of a Turing machine, we can certainly decide whether it has an even number of states, for example.
(Also, note that the undecidability of \(A_{TM}\) and \(\mathsf{Halt}_{TM}\) do not immediately follow from Rice’s Theorem because the languages \(A_{TM}\) and \(\mathsf{Halt}_{TM}\) are not of the right form.)
Reductions
We have seen two different types of arguments that can be used to prove the undecidability of a language: the Recursion Theorem method, and diagonalization arguments. There is a third method: reduction arguments. Informally, if we show that being able to decide language \(L\) implies that we can decide another language that is already known to be undecidable, like \(A_{TM}\), then this proves that \(L\) must also be undecidable.
Here is a simple example of a proof by reduction.
Theorem 3. The language
\[A_{TM}^{\varepsilon} = \{ \left< M \right> : M \mbox{ accepts } \varepsilon \}\]is undecidable.
Proof. Assume on the contrary that the Turing machine \(T\) decides \(A_{TM}^\varepsilon\).
Let us now define a Turing machine \(A\) that takes input \(\left< M \right> x\). Let \(M'\) be a Turing machine that first writes \(x\) on the tape and then copies the behaviour of \(M\). From the encoding of \(M\) and the string \(x\), \(A\) can determine the encoding of \(M'\). So it can call \(T\) on \(\left< M' \right>\) to determine whether \(M'\) accepts \(\varepsilon\) or not.
And since \(M'\) accepts \(\varepsilon\) if and only if \(M\) accepts \(x\), then \(A\) decides the language \(A_{TM}\), a contradiction.
Note that Theorem 3 also follows directly from Rice’s Theorem. Here’s another example that is a bit more interesting.
The \((m,k)\)th Busy beaver number is the maximum number \(bb(m,k)\) of steps that a Turing machine with \(m\) states and tape alphabet \(\Gamma_k = \{0,1,\ldots,k,\square\}\) can complete before halting on a tape that is initially empty. (Here the maximum is taken only over the Turing machines that do halt after a finite number of steps when the input is \(\varepsilon\).) The Busy beaver numbers grow incredibly fast as a function of \(m\) and \(k\). And they are uncomputable in the following sense.
Theorem 4. The language
\[BB = \{ \left< m \right> \left< k \right> \left< n \right> : bb(m,k) \le n \}\]is undecidable.
Proof. Assume on the contrary that there is a Turing machine \(T\) that decides the language \(BB\).
Let us now define a Turing machine \(A\) that takes input \(\left< M \right>\). As a first step, \(A\) uses the description \(\left< M \right>\) to determine the values of \(m\) and \(k\) for this particular machine \(M\). Then by calling \(T\) with input \(\left< m \right> \left< k \right> \left< n \right>\) for \(n=1,2,3,\ldots\) until \(T\) accepts, \(A\) can determine the value of \(bb(m,k)\). (Note that since there are only finitely many distinct Turing machines with \(m\) states and tape alphabet \(\Gamma_k\), the value of \(bb(m,k)\) is finite and so \(T\) will accept after a finite number of calls.)
Now \(A\) can simulate up to \(bb(m,k)\) steps of computation of \(M\) on input \(\varepsilon\). Specifically, it can do that by copying the behaviour of the Universal Turing Machine with one additional twist: a counter that is incremented after each simulation step and that interrupts the simulation when it reaches the value \(bb(m,k)\). If \(M\) accepts or rejects during the simulation, \(A\) does the same. Otherwise, at the end of \(bb(m,k)\) steps of simulation, \(A\) halts and rejects.
Then \(A\) decides the language \(A_{TM}^{\epsilon}\). That’s because if \(M\) accepts or rejects \(\varepsilon\), then \(A\) does the same. And if \(M\) runs for more than \(bb(m,k)\) steps, then by definition of the Busy beaver numbers it must run forever, which means that it does not accept \(\varepsilon\).
These two examples provide only a first glimpse into what we can do with reductions. We will discuss reductions in more detail in the next lecture and throughout the rest of the class.