17. Sublogarithmic Space

There are many natural languages that can be decided with logarithmic space complexity. There are also languages that can be decided with even less space.

Loglog Space Complexity

Counting is one of the basic problems that can be accomplished with a logarithmic amount of space. For example, consider the language \(\mathsf{Inc} \subseteq \{0,1,\texttt{#}\}^*\) defined by

\[\mathsf{Inc} = \{ x01^k\texttt{#}x10^k : x \in \{0,1\}^*, k \ge 0\}.\]

This language is the set of strings \(y\texttt{#}z\) where \(y\) and \(z\) are the the binary encodings of natural numbers that satisfy \(z = y+1\). It can be decided with a logarithmic amount of space.

Proposition. \(\textsf{Inc} \in \mathbf{L}\).

Proof. We only need a constant amount of space to verify that the input string is in the form \(y \texttt{#} z\) for some \(y,z \in \{0,1\}^*\). Then we can use a counter to check each symbol of the strings \(y\) and \(z\) one at a time to verify that they are in the format required to be in $\textsf{Inc}$.

We can build on this language to obtain one that has much smaller space complexity. Consider the language

\[\textsf{Count} = \{ 0^k\texttt{#} 0^{k-1}1 \texttt{#} 0^{k-2}10 \texttt{#} 0^{k-2}11 \texttt{#} \cdots \texttt{#} 1^k : k \ge 1\}\]

of all strings that include the binary encodings of all numbers \(0,1,2,3,\ldots,2^{k}-1\) for some \(k \ge 1\).

Proposition. \(\textsf{Count} \in \mathbf{SPACE}(\log\log n)\).

Proof. There is a natural algorithm for solving $\textsf{Count}$: check that the string before the first $\texttt{#}$ symbol is a string of the form $0^k$. Then use our algorithm for $\textsf{Inc}$ to check that each pair of binary strings separated by a $\texttt{#}$ symbol encode strings are the binary encodings of adjacent natural numbers. And finally check that the string ends on the string $1^k$. Since we can reuse the same amount of space for each $\textsf{Inc}$ check, the total space complexity of the algorithm is $O(\log k)$. And when the input string is in $\mathsf{Count}$, then its length is $n = 2^k(k+1) > 2^k$. This means that on those inputs, the space cost of the algorithm is $O(\log \log n)$.

There is one complication, however, that needs to be addressed to obtain a true $\log \log n$-space algorithm: for strings that are not in $\textsf{Count}$, we are not guaranteed that the length of the string is greater than $2^k$. For example, the input could be $0^k$, or $0^k \texttt{#} 0^{k-1}1$. On such inputs, we require the space cost of the algorithm to be much less than $O(\log k)$.

We can handle the complication by refining our algorithm so that it first checks that the input is of the form

\[z_0 0^\ell \texttt{#} z_1 0^{\ell-1}1 \texttt{#} \cdots \texttt{#} z_{2^{\ell}-1} 1^{\ell} \texttt{#} z_{2^\ell}\]

for some strings \(z_0,\ldots,z_{2^\ell} \in \{0,1\}^*\). This check can be done with $O(\log \ell)$ space. And doing so for $\ell=1,2,\ldots,k-1$ before running the original algorithm described above then guarantees that the space cost of the algorithm is $O(\log \log n)$ on all inputs of length $n$.

Sub-loglog Space Complexity

Are there languages that require even less than $O(\log \log n)$ space? There are indeed languages that can be solved with a constant amount of space. But that’s it! There are no languages with space complexity $o(\log \log n)$ and $\omega(1)$.

Theorem. For every function $s : \mathbb{N} \to \mathbb{N}$ that satisfies $s = o(\log \log n)$,

\[\mathbf{SPACE}(s) = \mathbf{SPACE}(1).\]

Proof. Let $M$ be any Turing machine with non-constant space cost $s = \omega(1)$. We will show that its space cost satisfies $s = \Omega(\log \log n)$.

Fix any $m \in \mathbb{N}$. Let \(x \in \{0,1\}^*\) be a shortest string for which $M$ has space cost at least $m$ and set $n = |x|$ to be the length of $x$.

Consider any position $i = 1,2,\ldots,n$ on the input tape. Every time the input tape head of $M$ crosses between the $i$th and $(i+1)$th positions on the input tape, its configuration (not counting the position of the input tape head) is defined by the internal state of $M$, the position of the work tape head, and the contents of the work tape. In total, there are

\[t = O(1) \cdot m \cdot |\Gamma|^m = 2^{O(m)}\]

such possible configurations. When running $M$ on $x$, we can list the configurations of $M$ as it crosses between the $i$th and $(i+1)$th position on the input tape to obtain the crossing sequence at $i$ of $M$ on input $x$.

The key observation is that the same configuration can never be repeated more than twice (once going left, and once going right) in any crossing sequence, otherwise $M$ would enter an infinite loop. So there can be at most $t^{2t}$ distinct crossing sequences on all the indices. Assume now that $n > 3t^t$. Then there must be at least 3 indices $i < j < k$ with the same crossing sequence, two of which must also have the same symbol on the input tape at that index. Say $i < j \in {1,2,\ldots,n}$ are two such positions. Then $M$ has the same space cost on the string $y = x_1\cdots x_i x_{j+1} \cdots x_n$ as it does on $x$, but this contradicts the assumption that $x$ is a shortest on which $M$ has space cost $m$. This contradicts our assumption, which means that we must therefore have

\[n \le 3t^t = 2^{2^{O(m)}}\]

or, in other words, that $m \ge \Omega(\log \log n)$.

Constant Space Complexity

Let’s now take a closer look at the class $\mathbf{SPACE}(1)$ of languages that can be decided with a constant space cost. Our first observation is that all the languages in this class can be computed with no use of the work tape at all. We make this precise by introducing 2-way Deterministic Finite Automata.

Definition. A two-way deterministic finite automaton (or 2DFA) is a Turing machine that has a read-only input tape but no work tape.

2DFAs are called two-way to emphasize that the finite automaton can still move the input tape head left or right on the input tape at each computational step. If we restrict the finite automata even more so that they can only move the input tape head to the right one cell at each computation step, we obtain the standard model of Deterministic Finite Automata (DFA).

Lemma. Every language in $\mathbf{SPACE}(1)$ can be decided with a 2DFA.

Proof. Let $M$ be a Turing machine with space cost at most $m$ on all inputs. There can be at most $m \cdot |\Gamma|^m = 2^{O(m)}$ possible positions of the work tape head and contents of the work tape. So by replacing each state in the state machine of $M$ with $2^{O(m)}$ states, we can ``store’’ the contents of the work tape and the position of the work tape head in the state machine. Since $m$ is constant, the resulting number of states is still constant, and we can now define the transition function using only the states to simulate the work tape. The result is a 2DFA that simulates $M$ on all inputs.

We can go one step further and show that every language that can be decided by 2DFAs can also be decided by regular (one-way) DFAs.

Lemma. Every language that can be decided by a 2DFA can also be decided by a DFA.

The class $\mathbf{REGULAR}$ is the set of languages that can be decided by DFAs.

As you may recall from other classes, the class of regular languages has a number of other equivalent characterizations. It is, for example, the set of languages that can be represented using regular expressions. The lemmas above give yet another characterization, showing that regular languages are exactly those languages that can be decided with constant space cost.

Theorem. $\mathbf{REGULAR} = \mathbf{SPACE}(1)$.


Eric Blais ©2024 — Last edited on Mar. 18, 2024

results matching ""

    No results matching ""