18. Axioms & Computability

Review of First-Order Logic

Let’s review what we have accomplished so far in our study of first-order logic.

The syntax of first-order logic allows us to encode many different mathematical statements in sentences. Structures let us connect these sentences to interpretations of their symbols over some fixed domain. With these interpretations, we can evaluate a sentence to obtain a True or False value with respect to the structure.

A structure that satisfies a set $\Gamma$ of sentences that we call axioms is called a model of this set of axioms. A sentence $\phi$ that evaluates to True with respect to every model of a set of axioms $\Gamma$ is a theorem. We write $\Gamma \vDash \phi$ when this is the case and say that $\phi$ is a logical consequence of $\Gamma$.

The formal proof system that we have introduced gives us a different way to relate sets of axioms to individual sentences. The set of axioms $\Gamma$ implies the sentence $\phi$, which we denote by $\Gamma \vdash \phi$, when there is a formal proof in our system with premises in $\Gamma$ and conclusion $\phi$.

These various ingredients and the Completeness and Soundness Theorems give us a method for establishing new mathematical truths:

Formalize the target statement as a first-order logic sentence $\phi$.
Find a set of axioms $\Gamma$ for which the interpretation of interest is a model.
Prove that $\Gamma \vdash \phi$.
Use the Completeness Theorem to conclude that $\Gamma \vDash \phi.$

The Soundness Theorem implies that whenever $\phi$ is a logical consequence of $\Gamma$, then there is a formal proof showing that $\Gamma \vdash \phi$ so that the method (in principle!) can be applied to obtain any mathematical truth. And the real power of this method is that if we do find a proof of our intended logical consequence $\Gamma \vDash \phi$ (as well as a proof that the structure of interest is a model of $\Gamma$), then we have conclusive evidence of its truth: to convince anyone else of its truth, all we need to do is send them the proof, and they can easily check (by hand in a straightforward if boring way, or algorithmically with a simple program) that all the derivation rules in a proof have been correctly justified and, if so, that the statement is indeed true.

Of course, finding the proof showing that $\Gamma \vdash \phi$ may not be an easy task. In fact, the task of identifying a good set of axioms $\Gamma$ in this method may not be easy either. Let’s examine this last issue a bit more closely.

A Wishlist for Axioms

So far, we have introduced a few different sets of axioms to study various mathematical topics like arithmetic, group theory, set theory, and geometry. But in each of these cases, the sets of axioms that we introduced are only one of many possible alternatives for the study of these topics. So let’s take a step back and ask: what makes a good set of axioms? There are at least four properties that we would like a set of axioms to satisfy:

Computability,
Consistency,
Soundness, and
Completeness.

Let’s discuss each of these properties individually.

1. Computability

There is one subtle but critical caveat in the claim above about the ease with which we can check proofs: the correctness of all the basic rules can indeed be checked mechanically, but the use of the premise rule PR can only be checked efficiently when we can easily recognize what is an axiom and what is not!

This critical issue is handled when we have a finite set of axioms. In this case, the algorithm simply compares the formula on a PR line with each axiom in the list and accepts if and only if it is equal to one of them. But there are many mathematical theories that (provably) do not have finite sets of axioms. Even the simple Presburger (addition-only) arithmetic setting has infinitely many axioms in the form of the principle of induction axiom schema.

So the condition that we want our axioms to satisfy, to avoid being overly restrictive, can be stated directly as that there must exist an algorithm that can always decide if an arbitrary string represents a sentence in the set of axioms or not. When such an algorithm exists, we say that the set of axioms is computable.

And we really can’t over-emphasize how important the computability property is for any set of axioms that we consider: without it, there really is no point in writing formal proofs since we have lost the ability to check them and therefore to unambiguously be able to agree on which proofs are correct.

2. Consistency

Another critical property that we want our sets of axioms to satisfy is (internal) consistency. We really don’t want the set $\Gamma$ of axiom to be able to derive a contradiction. Or, writing the same thing symbolically, we want to make sure that $\Gamma \not\vdash \bot.$

Consistency, like computability, is an absolutely necessary property for any interesting set $\Gamma$ of axioms to hold. Otherwise, as we have already seen, every sentence is implied by $\Gamma$ and so this set of axiom is completely trivial.

3. Soundness

A set $\Gamma$ of axioms is sound with respect to the structure $S$ if every sentence $\phi$ that is a logical consequence of $\Gamma$ evaluates to True under $S$. This is the case if and only if $S$ is a model of $\Gamma$.

And, of course, if our interest is in the structure $S$, we only want to consider sets of axioms that are sound with respect to it.

Note that soundness and consistency are related to each other, but they are not equivalent. In fact, soundness is a strictly stronger property in the following sense:

Proposition. If $\Gamma$ is sound with respect to $S$, then it is consistent.

4. Completeness

A set of axioms $\Gamma$ is complete if for every sentence $\phi$, at least one of $\phi$ or $\neg \phi$ is implied by $\Gamma$. That is, $\Gamma \vdash \phi$ or $\Gamma \vdash \neg \phi$.

When $\Gamma$ is both consistent and complete, then exactly one of the implications $\Gamma \vdash \phi$ and $\Gamma \vdash \neg \phi$ holds for every sentence $\phi$.

And when $\Gamma$ is both sound with respect to $S$ and consistent, then for every sentence $\phi$ that evaluates to True in $S$, we have a proof $\Gamma \vdash \phi.$

Warning. Even though we use the word complete as an adjective for both formal proof systems and sets of axioms, the word has very different meanings in these two contexts. For proof systems, it refers to the fact that every logical consequence has a corresponding proof. And for sets of axioms, it refers to the generality of the chosen axioms: there are certainly many sets of axioms that are incomplete in first-order logic even though our formal proof system is itself complete!

Is completeness a required property for the sets of axioms that we consider, in the same way that computability, consistency, and soundness were? Perhaps not, as a set of axioms that is not complete can still be used to prove many logical consequences of interest. But it is definitely a desired feature: we would like to know that, once we have picked the “right” set of axioms, we can then use it to prove everything about a given mathematical theory using it. Going back to the discussion at the beginning of the lecture, it would mean that the only thing that mathematicians would have to focus on once this step is accomplished for their area of interest is to try to construct the proof for any statement of interest (or its negation!), safe in the knowledge that such a proof does exist.

A Brief History of Axiom Systems

Are there any sets of axioms for interesting mathematical theories that do satisfy all four of the items in our wish list?

Yes!

Basic Arithmetic

When considering only addition (but not multiplication) over the natural numbers, we can obtain a set of axioms for this theory by taking the subset of the Peano axioms consisting of

\[\begin{align*} &\forall x \neg S(x) = 0 \\ &\forall x \forall y \, (S(x) = S(y) \to x = y) \\ &\forall x \, (x+0) = x \\ &\forall x \forall y \, (x+S(y)) = S(x+y) \end{align*}\]

and the (infinitely many) axioms generated from the Principle of Induction axiom schema. This set of axioms is computable. It is consistent. It is sound with respect to the natural model $N$ of addition on $\mathbb{N}$. And Presburger showed that it is also complete.

Geometry

In geometry, we also have a similar success story. All of planar Euclidean geometry can be formalized in a first-order language that defines a ternary “betweenness” relation (where $B(x,y,z)$ is meant to represent the assertion that $y$ is on the line segment connecting $x$ and $z$) and a 4-ary relation of “congruence” (where $C(x,y,z,w)$ is meant to represent the assertion that the line segment $xy$ has the same length as the line segment $zw$.)

Tarski gave a set of axioms for this theory that is computable, consistent, sound with respect to the model defined by Euclidean geometry of points on the plane. And he showed that this theory is also complete.

Set Theory?

Inspired by the success stories above, we might then want to look for axioms that satisfy all four of the properties in our wish list for set theory. There has in fact been much work in attempting to do that, including some notable efforts by Frege and by Russell and Whitehead, because a success in this one area of mathematics would have extremely wide-ranging implications. Indeed, in parallel with the work to axiomatize set theory, there was also a lot of work which showed that many other branches of mathematics can be formulated in terms of set theory. So a good set of axioms for set theory would in fact provide a solid foundation not just for that field but for much of mathematics itself.

However, it is not to be: Gödel’s famous First Incompleteness Theorem shows that we cannot hope to have axioms for set theory that simultaneously satisfies all four of the properties on our wish list.

Gödel’s First Incompleteness Theorem. (Special case) There is no set of axioms that is computable, consistent, sound with respect to set theory, and complete.

And, not only that, the same is true for most of the other mathematical theories, including arithmetic over the natural numbers when we consider both addition and multiplication.

We will revisit and discuss this theorem in more details in a few lectures. But before we can do that, we need to spend a bit of time formalizing the notion of computability. Proving that something is computable is straightforward: giving an algorithm (or a program in your favourite programming language) suffices. But to prove that something is not computable seems more challenging. We want to carefully define exactly what we mean by computability so that (i) we can indeed prove that some sets of axioms are uncomputable, as required for the proof of the incompleteness theorem, and (ii) our notion of computability is general enough so that uncomputability really captures the impossibility of writing any algorithm (in any programming language!) for the task.

We will do so with the help of Turing machines.