5. Time Complexity
The set of decidable languages tells us what decision problems can be solved by computers that don’t have any restrictions on the amount of resources that they require to complete the computation.
We can also ask about which languages can be computed “efficiently” by adding some restrictions to the amount of resources that computers can use during their execution. Doing so leads us to complexity theory.
In this first lecture on complexity theory, we consider what is possibly the most natural resource that we might want to restrict: time.
Time Complexity Classes
The time cost of the Turing machine \(M\) on input \(x \in \{0,1\}^*\) is the number of computational steps that \(M\) performs on that input before it halts. When \(M\) does not halt on input \(x\), its time cost can be defined to be \(\infty\), but from now on we will only consider Turing machines that halt on all inputs.
From the time cost of a Turing machine on individual inputs, we obtain a global measure of its time cost.
Definition. The (worst-case) time cost of the Turing machine \(M\) is the function \(t_M : \mathbb{N} \to \mathbb{N}\) where \(t_M(n)\) is the maximum time cost of \(M\) on any input \(x \in \{0,1\}^n\) of length \(n\).
With this notion of time cost, we can refine the class of all decidable languages into various time complexity classes.
Definition. For every function \(f : \mathbb{N} \to \mathbb{N}\), the time complexity class \(\textbf{TIME}(f)\) is the set of all languages that can be decided by a Turing machine \(M\) with worst-case time cost \(t_M \le O(f)\).
By considering various functions, we obtain a number of natural time complexity classes:
Time Class | Definition |
---|---|
Constant time | \(\mathbf{TIME}(1)\) |
Linear time | \(\mathbf{LIN} = \mathbf{TIME}(n)\) |
Quasi-linear time | \(\bigcup_{k \ge 0} \mathbf{TIME}(n \log^k(n))\) |
Quadratic time | \(\mathbf{TIME}(n^2)\) |
Cubic time | \(\mathbf{TIME}(n^3)\) |
Polynomial time | \(\mathbf{P} = \bigcup_{k \ge 0} \mathbf{TIME}(n^k)\) |
Linear-exponential time | \(\mathbf{E} = \mathbf{TIME}(2^{O(n)})\) |
Exponential time | \(\mathbf{EXP} = \bigcup_{k \ge 0} \mathbf{TIME}(2^{n^k})\) |
Double exponential time | \(\mathbf{EEXP} = \bigcup_{k \ge 0} \mathbf{TIME}(2^{2^{n^k}})\) |
Iterated exponential time | \(\mathbf{ELEMENTARY} = \bigcup_{k \ge 0} \mathbf{TIME}(\underbrace{2^{2^{\cdots^{2^n}}}}_{k})\) |
Warning. Time complexity classes can also be defined in terms of other models of computation. In particular, time classes are often defined in terms of the minimum number of operations that multi-tape Turing machines execute before they halt. And this can matter: for some of the classes listed above, the class does not remain the same if we change the model of computation used in the definition.
Time Hierarchy Theorem
Once we have defined these time complexity classes, it’s natural to ask whether they are all distinct from each other. In other words, does giving Turing machines more time let them decide more languages?
In the case of linear and cubic time, we can use the Universal Turing Machine to show that the two classes are indeed distinct.
Theorem 1. \(\mathbf{TIME}(n) \subsetneq \mathbf{TIME}(n^3)\).
The idea of the proof is to use the two fundamental observations that we established in the first part of the course: Turing machines can simulate any Turing machine given its description \(\left< M \right>\), but they can’t predict what \(M\) will do given its description without essentially simulating it.
Proof. Consider the language
\[A_{TM}^{n^{2}} = \{ x = \left< M \right> 0^k : M \mbox{ accepts } x \mbox{ in } \le |x|^{2} \mbox{ steps}\}.\]We first want to show that the language \(A_{TM}^{n^{2}}\) is in \(\mathbf{TIME}(n^3)\). We already know that the Universal Turing Machine \(U\) can simulate \(M\) on \(x\). We can modify \(U\) to augment it with a timer initialized with the value \(|x|^2\) so that it only simulates that number of steps of \(M\) before halting the simulation. This variant of \(U\) accepts if and only if \(x \in A_{TM}^{n^2}\). And if we design this variant of \(U\) carefully and then analyze the number of steps that it needs to use to complete the simulation, we find that it can simulate each step of \(M\)’s computation in time at most \(O(n)\); overall it has worst-case time cost bounded by \(O(n^3)\). Therefore, \(A_{TM}^{n^2} \in \mathbf{TIME}(n^3)\).
We now want to show that \(A_{TM}^{n^{2}} \notin \mathbf{TIME}(n)\). Assume for the sake of contradiction that it is. Let \(T\) decide \(A_{TM}^{n^2}\) in linear time.
Consider the machine \(D\) that on input \(x = \left< M \right> 0^k\) runs \(T\) on input \(x\) and then accepts if and only if \(T\) rejects. The machine \(D\) also has linear time cost. What does \(D\) do on the input \(x = \left< D \right> 0^k\)? When \(k\) is large enough, then it always halts within at most \(|x|^{2}\) steps. So it accepts if and only if \(T\) rejects, contradicting the assumption that \(T\) decides \(A_{TM}^{n^{2}}\).
The same idea gives a much more general result known as the Time Hierarchy Theorem. But the statement of this theorem requires one more definition.
Definition. The function \(t : \mathbb{N} \to \mathbb{N}\) is time-constructible if there is a Turing machine that on every input of the form \(0^n\) for some \(n \ge 1\) writes \(t(n)\) on the tape and halts in time \(O(t(n))\).
All the functions in the table of complexity classes above are time-constructible. And they are all distinct, as the Time Hierarchy Theorem shows.
Time Hierarchy Theorem. For every pair of functions \(f, g \colon \mathbb{N} \to \mathbb{N}\) where \(f \log f = o(g)\) and \(g\) is time-constructible, \(\mathbf{TIME}(f) \subsetneq \mathbf{TIME}(g)\).
The proof of this version of the Time Hierarchy Theorem is essentially the same as the one we used for the special case above. But it does require two additional technical components:
- To obtain this version of the time hierarchy theorem, we need a more efficient version of the Universal Turing machine (with timer) which incurs only a logarithmic overhead in simulating each step of \(M\) (instead of a linear overhead, as we used in the proof of the special case above).
- And, more importantly, we also need to make sure that given the length \(n\) of the input \(x\), the machine \(U\) can initialize its timer with the value \(h(n) = g(n)/\log g(n)\). This is where time-constructibility comes in.
Is the time constructability condition in the theorem statement necessary? Maybe (perhaps with a slightly more clever approach) we could generalize the Time Hierarchy Theorem even further so that it holds for all pairs of functions \(f,g\) that satisfy \(f \log f = o(g)\)? No, we can’t! Time constructibility is a necessary condition in the Time Hierarchy Theorem. In fact, the (Time Hierarchy) Gap Theorem shows that it is possible to design a function \(f\) for which \(\mathbf{TIME}(f(n)) = \mathbf{TIME}(2^{f(n)})\). (Needless to say, this function \(f\) is not time-constructible; it also happens to be extremely fast growing.)
The Class P
The class \(\mathbf{P} = \bigcup_{k \ge 0} \mathbf{TIME}(n^k)\) of languages that can be decided by polynomial-time Turing machines plays a particularly important role in computer science. In fact, the class \(\mathbf{P}\) is the one that is often used to represent the class of decision problems that are “efficiently” computable. But why this class? In part, it’s because the class \(\mathbf{P}\) has two important properties that make it particularly well-suited for study: its closure properties, and its robustness.
The most important closure property of the class \(\mathbf{P}\) is that it is closed under subroutine calls. Let’s say that \(M\) is a Turing machine that decides a language but, to do so, simulates some (subroutine) Turing machines \(N_1, N_2, \ldots\) to solve some intermediate subproblems. Let’s say further that (1) when we only consider each call to one of the \(N_i\) machines as taking a single time step then \(M\) has polynomial time cost, and (2) each Turing machine \(N_i\) has polynomial time cost. Then \(M\) is a polynomial-time Turing machine.
What this closure property means is that we can use high-level descriptions of algorithms to define Turing machines (as we have already done on multiple instances). Then as long as the algorithm itself has polynomial time cost and each instruction of this algorithm can be computed in polynomial-time, then the result is a polynomial-time Turing machine. Note that the same property is not true of \(\mathbf{LIN}\) (or of many other natural time classes): if an algorithm has a linear number of instructions and each instruction can be completed in linear time, the overall time complexity of the algorithm can be as large as quadratic.
The robustness of the class \(\mathbf{P}\) lies in the fact that the Universal Turing Machine theorem can be extended to show that there is a Turing machine that can efficiently simulate (up to polynomial overhead) many other machine models as well. As a result, the set of languages that can be computed in polynomial time within those machine models are all contained in \(\mathbf{P}\). In fact, it is believed that this robustness property holds not only for some other machine models, but for all reasonable models of computation.
Cobham-Edmonds Thesis. Any decision problem that can be solved in polynomial time by an algorithm on any physically-realizable computer corresponds to a language that is in \(\mathbf{P}\).
(This thesis is quite a bit stronger than the Church-Turing thesis. For this reason it is also sometimes known as the strong Church-Turing thesis.)
Eric Blais ©2024 — Last edited on Jan. 19, 2024