On this page:
5.1 De Bruijn indices
5.2 A calculus of holes
5.3 Hiding the calculus
5.4 Unification

5 Interaction In Predicate Logic

\(\newcommand{\la}{\lambda} \newcommand{\Ga}{\Gamma} \newcommand{\ra}{\rightarrow} \newcommand{\Ra}{\Rightarrow} \newcommand{\La}{\Leftarrow} \newcommand{\tl}{\triangleleft} \newcommand{\ir}[3]{\displaystyle\frac{#2}{#3}~{\textstyle #1}}\)

We is OK, from our fear.
(Wake Up, Essential Logic, 1979)

This optional chapter partially redresses a deficiency in Proust as we left it at the end of the chapter on predicate logic. With propositional logic, we had a useful way of using Racket’s REPL to refine holes in proofs, gradually filling them in until they were complete. We could not manage that interaction with predicate logic. Instead, we had to use global definitions, submitting each one in complete form. In the chapter on proof assistants, we saw that both Agda and Coq provide interaction in different forms: Agda with holes and refinement combined with pattern matching, and Coq with tactics.

In this chapter, I will show how to develop Proust so that it can handle the simplest cases of refinement in Agda. Even that minimal extension will double the size of the program. This won’t really advance your knowledge of logic, computation, and their relationship; nor will it result in a tool that is even as usable as Proust for propositional logic. It’s more of a proof of concept, and a case study in some of the principles underlying real-world proof assistants. Hopefully it will help you use such tools more effectively.

The main difficulty is that the use of dependent types can create dependencies among holes in a program, of a type that did not arise with propositional logic. The solution is a calculus of holes that treats holes as binding constructs, and handles them appropriately. But before we get to that, there is another potential obstacle to remove. Using explicit names for variables gave us some trouble earlier. It will give us more trouble here, and it isn’t what modern proof assistants do, so we’ll start by shifting to one of the alternatives in common use.

5.1 De Bruijn indices

I described de Bruijn indices briefly in the chapter on predicate logic. They were introduced by Nicolaas de Bruijn for use in his pioneering Automath proof assistant in 1970. In designing Automath, de Bruijn independently described many of the ideas behind dependent types (Howard, of the Curry-Howard isomorphism, being the other person credited). de Bruijn indices are very useful in proof formalization (his original 1972 paper showed how a major result could be simplified using them), but they are also used in compilers and interpreters for functional programming languages. As we’ll see below, they are really not human-readable, which is why we continue to use names in programming. Research papers and other expositions also use names, but sometimes they will say that they are really using de Bruijn indices, which means they are trying to make their inference rules readable without all the messiness that would be required if they actually filled in all the details necessary to use names.

(Speaking of names, if you say "d’Brown", you get a reasonable approximation to the Dutch pronunciation of de Bruijn’s name.)

The basic idea is simple. Binding constructs, such as lambda and for-all, do not mention a variable name. Instead, each variable reference (occurring at a leaf of the AST) "points" to the binder being referred to. This "pointer" is a natural number (starting at 0, of course) specifying the number of binders that should be skipped when moving from the leaf towards the root of the AST before arriving at the referenced binder.

Here are some examples. The identity function, \(\la x . x\), becomes \(\la . 0\).
\(\la x . \la y. y\) becomes \(\la . \la . 0\).
\(\la x . \la y . x\) becomes \(\la . \la . 1\).
The Church numeral for two, \(\la s . \la z . s~(s~z)\), becomes \(\la . \la . 1~(1~0)\).

You should now be able to see the advantage of using de Bruijn indices. Shadowing is not possible, so substitution doesn’t have to worry about variable capture, and the side conditions about freshness go away. How do we represent a free variable? References to bound variables are always less than the number of enclosing binders, so all the rest of the natural numbers can be used for free variables. The excess (after being diminished by the number of enclosing binders) becomes an index into the context, counting from the most recently added entry. A context used to be a list of name-type pairs; now it is just a list of types. For example, if we have the term \(\la x . y\) in context \(y:A,z:B\), then using de Bruijn indices, the term would be \(\la . 2\) (skipping the \(\la\) binder and the entry for \(z\)) and the context \(A,B\).

The big disadvantage of de Bruijn indices is their lack of human readability. They’ve been described as "Martian". But they’re a perfectly fine internal format (heavily used as such in interpreters and compilers for functional programming languages), and it’s not hard to translate human-readable syntax into this format, and back again (if a list of names for indices is provided).

A conventional presentation of this idea focusses next on substitution as needed to define reduction. But this introduces new complications. It’s natural to define substitution by structural recursion on the term being substituted into, and that’s what we did before. As we pass under a binder in that recursion, and add to the context, the index of the variable being substituted for goes up by one (it starts at zero at the beginning of the operation), and we have to keep track of that (before, it was just a name that didn’t change). But, also, the indices of the free variables in the term being substituted have to go up by one, to point correctly into the new context. That requires another structural recursion on the term being substituted. And here’s the problem: that second recursion passes under binders in the term being substituted. So a threshold must also be maintained, such that indices below that threshold (which represent bound variables in the original term being substituted, but which look free because binders have been removed) shouldn’t be adjusted.
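For concreteness, here is a sketch of the threshold-based adjustment just described (hypothetical code, using the AST constructors that appear below; the Pierce-style implementation in the handout may differ in details).

; shift : Nat Nat Expr -> Expr
; adds d to every free variable of expr; indices below the threshold c
; are bound within the term being shifted and are left alone

(define (shift d c expr)
  (match expr
    [(Var n) (if (< n c) expr (Var (+ n d)))]
    [(Lam x b) (Lam x (shift d (add1 c) b))]
    [(App e1 e2) (App (shift d c e1) (shift d c e2))]
    [(Ann e t) (Ann (shift d c e) (shift d c t))]
    [(Arrow x t1 t2) (Arrow x (shift d c t1) (shift d (add1 c) t2))]
    [_ expr]))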

This sounds messy, because I’m explaining it quickly; it’s not that bad, and you might enjoy working out the details. But instead of doing this, we’re going to use another clever idea at this point, for more clarity. It’s due to Conor McBride, from the brief unpublished draft paper "Type-preserving renaming and substitution" (several published followup papers use the idea, with credit). McBride noticed the similarity between the index adjustment described above, which is a form of renaming, and a more general version of substitution. What we did before, which we’ll now call single substitution, in the named case consumes a term to be substituted into, a variable name, and a term to be substituted, and produces a term. With de Bruijn indices, single substitution consumes a term to be substituted into in some context \(\Gamma,B\), and a term to be substituted for the variable 0, and produces a term in context \(\Gamma\).

To make substitution look like renaming, we generalize it to substitute for more than one variable. We replace the term to be substituted for the variable 0 with a map from variables over \(\Gamma\) (that is, natural numbers) to terms (over another context \(\Delta\)) to be substituted for them. If we then move the term being substituted into to be the last argument, and notice that we can parenthesize this and the result type (because \(\ra\) is right-associative), the result type becomes a map from terms in context \(\Gamma\) to terms in context \(\Delta\).

Renaming consumes a map from variables in context \(\Gamma\) to variables in context \(\Delta\), and produces a map from terms in context \(\Gamma\) to terms in context \(\Delta\). That is, it lifts a map from variables to variables to a map from terms to terms. Our general substitution lifts a map from variables to terms to a map from terms to terms. Now the similarities are more evident.

McBride’s paper abstracts away the differences between renaming and general substitution to create a single operation of which these are special cases. It also represents a term by the proof that it has a given type (this is possible because type judgments, as we’ve seen, are structural). We won’t do that here, but it’s a great idea, especially when working in a statically-typed language like Agda. There are some binding constructs one can define that bind two or more variables at once (for example, a case construct for a datatype that has constructors with multiple arguments) and this is also easy to cleanly handle using general substitution.

The book Programming Language Foundations in Agda (PLFA), described more fully in the Resources chapter, uses these ideas to formalize type theory for a lambda-calculus language, but implements rename followed by general substitution as two separate but very similar functions. Both McBride’s paper and PLFA use renaming as a helper in general substitution (it is used when passing under a binder, because the context changes, so terms in the old context have to be lifted to the new context). Single substitution is the special case where the map is the identity function on everything but variable 0, and the two contexts are \(\Gamma,B\) and that context with its most recent entry removed, namely \(\Gamma\).

Let’s look at the code that results from using McBride’s generalization, with the starting point being our minimal predicate logic previously linked as proust-pred-forall.rkt. Each of renaming and general substitution (called psubst below, for "parallel substitution") uses an associated helper function to handle going under a binder in the recursion.

; A Var is a natural number (starting at 0)
 
; ext : (Var -> Var) -> (Var -> Var)
; Given vvmap which maps variables in context Γ to variables in context Δ,
; produces the resulting map from variables in context Γ,B to variables in context Δ,B
 
(define (ext vvmap)
  (lambda (v) (if (zero? v) 0 (add1 (vvmap (sub1 v))))))
 
; rename : (Var -> Var) Expr -> Expr
; Given vvmap which maps variables in context Γ to variables in context Δ,
; and expr in context Γ, produces the resulting expression in context Δ
; (contexts are implicit, they're not needed in the computation)
 
(define (rename vvmap expr)
  (match expr
    [(Lam x b) (Lam x (rename (ext vvmap) b))]
    [(App e1 e2) (App (rename vvmap e1) (rename vvmap e2))]
    [(Var n) (Var (vvmap n))]
    [(Def _) expr]
    [(Ann e t) (Ann (rename vvmap e) (rename vvmap t))]
    [(Arrow x t1 t2) (Arrow x (rename vvmap t1) (rename (ext vvmap) t2))]
    [(Type) expr]))
 
; exts : (Var -> Expr) -> (Var -> Expr)
; Given vtmap which maps variables in context Γ to expressions in context Δ,
; produces the resulting map from variables in context Γ,B to expressions in context Δ,B
 
(define (exts vtmap)
  (lambda (v) (if (zero? v) (Var 0) (rename add1 (vtmap (sub1 v))))))
 
; psubst : (Var -> Expr) Expr -> Expr
; Given vtmap which maps variables in context Γ to expressions in context Δ,
; and expr in context Γ, produces the resulting expression in context Δ (parallel substitution)
 
(define (psubst vtmap expr)
  (match expr
    [(Lam x b) (Lam x (psubst (exts vtmap) b))]
    [(App e1 e2) (App (psubst vtmap e1) (psubst vtmap e2))]
    [(Var n) (vtmap n)]
    [(Def _) expr]
    [(Ann e t) (Ann (psubst vtmap e) (psubst vtmap t))]
    [(Arrow x t1 t2) (Arrow x (psubst vtmap t1) (psubst (exts vtmap) t2))]
    [(Type) expr]))
 
; substitution for (Var 0) and shifting down of all other variables,
; which is what's needed for beta-reduction
 
(define (subst repl expr)
  (psubst (lambda (var) (if (zero? var) repl (Var (sub1 var)))) expr))
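
As a quick sanity check, here is how subst behaves on a small term (a hypothetical REPL interaction, using the AST structs from proust-pred-forall.rkt):

> (subst (Var 5) (App (Var 0) (Lam 'y (App (Var 1) (Var 0)))))
(App (Var 5) (Lam 'y (App (Var 6) (Var 0))))

The term being substituted is shifted to (Var 6) when it lands under the inner lambda, while the lambda’s own bound variable (Var 0) is left alone.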

This code is included in proust-pred-forall-db.rkt, linked in the Handouts chapter. For completeness, that file also includes an implementation of McBride’s original single function followed by the specialized uses of it for rename and general substitution, and a more conventional implementation based on the code in Types and Programming Languages, a widely-used textbook by Benjamin Pierce. The latter implementation uses arithmetic on indices and thresholds as described above.

We’re not going to make the user write their terms using de Bruijn indices. The conversion is done during parsing, with the aid of an additional parameter which maintains a list of binders seen so that variable references can be converted to numbers. We also don’t want to make the user read de Bruijn indices, especially ones they haven’t written themselves, so we’ll keep the names around in the structs associated with binders and paired with the types in a context. They just won’t be used for the purposes of substitution and lookup, only for pretty-printing. We still need to freshen, because users may reuse variable names, and we want to disambiguate in error messages.
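The index computation during parsing might look like this sketch (lookup-name is a hypothetical helper; the actual parser threads the binder list through its recursion):

; lookup-name : (Listof Symbol) Symbol -> (U Nat #f)
; position of the most recent binder named x, or #f if x is free/global

(define (lookup-name env x)
  (cond
    [(empty? env) #f]
    [(equal? (first env) x) 0]
    [else (let ([r (lookup-name (rest env) x)])
            (and r (add1 r)))]))

When parsing (λ x => e), the parser conses x onto env before recursing into e, so a reused name resolves to the nearest enclosing binder.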

The variable names in the AST should be ignored for the purpose of alpha-equivalence. Without them, we could just use Racket’s equal? function. Racket structs have many features, and one of them permits the creation of custom equality tests. But the code is clearer (though less efficient) if we simply erase names before testing equality.
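Here is a minimal sketch of that idea (assuming the AST structs above; every name hint is replaced by an underscore before comparison):

(define (erase expr)
  (match expr
    [(Lam _ b) (Lam '_ (erase b))]
    [(App e1 e2) (App (erase e1) (erase e2))]
    [(Ann e t) (Ann (erase e) (erase t))]
    [(Arrow _ t1 t2) (Arrow '_ (erase t1) (erase t2))]
    [_ expr]))

(define (alpha-equiv? e1 e2)
  (equal? (erase e1) (erase e2)))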

Having done all this, the code for type synthesis and checking is much clearer and closer to the inference rules.

(define (type-check ctx expr type)
  (match expr
    [(Lam x b)
       (type-check ctx type (Type))
       (match (reduce type)
         [(Arrow _ tt tw) (type-check (cons (list x tt) ctx) b tw)]
         [else (cannot-check ctx expr type)])]
    [else (if (equiv? (type-synth ctx expr) type) true (cannot-check ctx expr type))]))
 
(define (type-synth ctx expr)
  (match expr
    [(Var n) (lookup ctx n)]
    [(Def x)
       (cond
         [(assoc x deftypes) => second]
         [else (cannot-synth ctx expr)])]
    [(Lam x b) (cannot-synth ctx expr)]
    [(App f a)
       (match (reduce (type-synth ctx f))
         [(Arrow _ tt tw) #:when (type-check ctx a tt) (subst a tw)]
         [else (cannot-synth ctx expr)])]
    [(Ann e t) (type-check ctx t (Type)) (type-check ctx e t) t]
    [(Arrow x tt tw)
       (type-check ctx tt (Type))
       (type-check (cons (list x tt) ctx) tw (Type))
       (Type)]
    [(Type) (Type)]))

There is one final complication, which illustrates the possibility of subtle bugs when working with de Bruijn indices. Lookup seems simple, since contexts are just lists of types and variable references are just numbers denoting a position in the list. But when a type was added to a context, it might have its own variable references to that context. The context might have been added to subsequently, that is, the lookup might be taking place in an extended context. The retrieved type might not have correct variable references relative to the extended context. The solution is to use rename to appropriately adjust the variable references in the retrieved type on the way back from the lookup recursion.

(define (lookup ctx n)
  (cond
    [(empty? ctx) (error 'lookup "empty context, index ~a\n" n)] ; shouldn't happen
    [(zero? n) (rename add1 (second (first ctx)))]
    [else (rename add1 (lookup (rest ctx) (sub1 n)))]))
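
To see the adjustment in action, consider the context \(A:\mathrm{Type}, a:A\), stored most recent first. Here is a hypothetical interaction:

> (define ctx (list (list 'a (Var 0)) (list 'A (Type))))
> (lookup ctx 0)
(Var 1)

The type of a was recorded as (Var 0) when a was added, because A was then the most recent entry; relative to the full context, A sits at index 1, and the rename on the way back from the recursion makes the retrieved type point there.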

Exercise 45: Extend the code in proust-pred-forall-db.rkt to handle equality, Booleans, and natural numbers. \(\blacksquare\)

This is not the only possible approach. Lena Magnusson’s 1995 PhD thesis on ALF, the precursor to Agda, described a name-based notion of explicit substitutions to handle the situations described in this chapter. Jamie Gabbay and co-authors have developed what they call "nominal techniques" to work with names. There are also many variations on de Bruijn indices suitable for certain applications. We will stick with the original idea.

5.2 A calculus of holes

Holes are often referred to as metavariables in discussions of how to handle them, because they represent unknowns, and we may wish to substitute an arbitrary expression containing other metavariables rather than a final answer. The property that made our lives easier in Proust for propositional logic was the ability to substitute any type-correct expression into a hole, independent of the other holes in the proof. This property does not hold in the presence of dependent types.

For a simple example of this, consider the polymorphic identity function, which earlier we called \(\mathrm{id}\). Consider the type judgment \((\mathrm{id}~?) ~:~ ? \ra ?\). Each hole has type \(\mathrm{Type}\), but if one is filled with an expression without a hole, then the others must be filled with the same expression. Filling two or three of them with different hole-containing expressions might still leave all the holes fillable, or might make a solution impossible.

It’s difficult to see how this might be handled. Many solutions have been proposed, with various theoretical and practical properties. One particularly effective basis was provided by Conor McBride in their 1999 PhD thesis. McBride was working with a system called LEGO (in defiance of trademark), and called their calculus of holes OLEG, as it was a "slight rearrangement" rather than a drastic rewrite of the system.

McBride’s ideas were put to use in the dependently-typed programming language Idris 1 (2009-present), as described by its creator Edwin Brady in a 2013 paper. They are also mentioned in Ulf Norell’s 2007 PhD thesis describing Agda 2 (the current version) as being the basis of Agda’s handling of metavariables. All of these sources are eminently readable; McBride’s thesis in particular is full of wit, delivered in a distinctive voice. But they are written for other experts, and contain much detail that can be intimidating at first.

As with the earlier versions of Proust we have looked at, I originally wrote the versions in this chapter in order to more fully understand the ideas behind OLEG, since tutorial implementations are scarce. I will describe three iterations of Proust for predicate logic with holes, with just enough detail to let you understand my code. I hope this exposition will help you if you decide to consult the original sources, or related work in this area.

We start by adding a let construct to the term language. We’ve discussed this before. It’s equivalent to an immediate application of a lambda (which needs a type annotation in our bidirectional typing system) and the type synthesis rule directly follows from this. Here’s the grammar so far.

  expr = (λ x => expr)
  | (expr expr)
  | (let x = expr in expr)
  | (expr : expr)
  | x
  | ( (x : expr) -> expr)
  | (expr -> expr)
  | Type

And here’s the resulting type synthesis rule.

\[\ir{(\mathrm{let})} {\Gamma\vdash t\Ra T ~~~ \Gamma,T\vdash w\Ra W} {\Gamma\vdash \mathrm{let}~x = t ~\mathrm{in}~ w \Ra W[x\mapsto t]}\]

Reduction of a let proceeds pretty much as expected; we do the substitution and reduce the result.
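
In code, the new clauses might look like this (a sketch, assuming a (Let x t w) struct alongside the existing ones; the handout file may differ in details):

; in type-synth: the let rule above, with subst performing W[x ↦ t]
[(Let x t w)
   (define tt (type-synth ctx t))
   (subst t (type-synth (cons (list x tt) ctx) w))]

; in reduce: substitute and reduce the result
[(Let x t w) (reduce (subst t w))]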

McBride’s insight was that a hole should be treated as a binding construct. The place where it is created is a binding occurrence, and the presence of dependent types creates the need to refer to it elsewhere, that is, to introduce bound occurrences of variables. In Proust for propositional logic, holes were leaves of the AST, but here they can be defined at internal nodes, and their scope, as with other binding constructs, is lexical. That is, the scope of a hole is the subtree rooted at the node where the hole is defined, namely the body expression of the hole. This restricted scope helps tame the issues caused by holes.

As in Proust for propositional logic, a hole will be represented in the user language with a single question mark, and we will assign it a unique number using a counter. This number is like a variable name; it’s for the user’s convenience, and internally we’ll use de Bruijn indices. The AST node for a hole has the hole number, the type of the hole, and the body expression (subtree). For human-readable notation, we’ll specify hole \(x\) with type \(T\) and body expression \(e\) as \(?x:T~.~e\). (Both McBride and Brady use this notation.)

The typing rule for holes is not surprising.

\[\ir{(\mathrm{hole})} {\Gamma,T\vdash e\Ra W} {\Gamma\vdash ?x:T~.~e \Ra W}\]

McBride observes that problems compound when a type on the right side of a colon or directional arrow in a typing judgment contains a binding occurrence of a hole. Such a type can refer to a hole, but it should not define one. The manipulations of the hole calculus yet to be defined will avoid this.

Our intention with holes is to eventually fill them in completely. To that end, we introduce a variation on a hole, called a guess. A guess will also have a partially constructed term. We add this into our human-readable notation with a wiggly equal sign, like this: \(?x\approx t:T~.~e\).

A guess resembles a let, and when the guess is suitably complete, we "fill in the hole" by converting it to a let (hence the reason for introducing let at this point). However, unlike a let, there is no computational rule for a guess. This is one of the ways in which holes are prevented from migrating to the wrong side of a type judgment.

Here’s the typing rule for guesses.

\[\ir{(\mathrm{guess})} {\Gamma,T\vdash e\Ra W} {\Gamma\vdash ?x\approx t:T~.~e \Ra W}\]
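
In the implementation, holes and guesses become AST structs, plausibly something like the following (the field order for Hole matches its construction in claim/i below; the Guess fields are my guess at the shape):

(struct Hole (num type body) #:transparent)
(struct Guess (num expr type body) #:transparent)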

At any point during an interaction, there will be a current proof term under construction, and one of its holes will be the current focus. The user can’t just type anything into that hole; they will be restricted to a set of operations that we define. An operation will update the proof term and possibly the focus. But they may use an operation in a manner that fails partway through. So we also need a notion of local proof state (term and focus), which we bundle into a struct.

The first iteration of Proust with holes is linked on the Handouts page as proust-pred-forall-db-holes-basic.rkt. After the state variables, the first function to be defined is new-def, which the user will invoke in the REPL to initialize a new proof.

The new-def function parses and validates the type \(T\) given, and then sets up the initial proof term as \(?x:T.x\) (using a de Bruijn index instead of the name \(x\)). This is as generic as we can make an empty hole, and it is also the form that some of the later operations (which we will call tactics) expect.

The next section of the program sets up common infrastructure for tactics to use. Some tactics will operate on a single hole: for example, partially filling it in, or printing the context. Some tactics will operate on all holes: for example, printing the type of each goal.

A single generic function, tactic-r, does this kind of work. It recursively descends into the tree, maintaining context as it goes, and then does the appropriate thing at a hole (depending on the setting of various parameters). It is a bit long, but uses straightforward structural recursion. As written, it assumes that the tactic will change the subtree associated with a hole, so it produces the new tree as a value, reconstructing it on the way back from the recursion.

This seems inefficient, but it is easy to understand. Brady used this same idea in his 2013 paper on the design and implementation of Idris 1. The actual implementation of Idris 1 added one more data structure to improve efficiency. In a tree, it is easy to move from a node to one of its children in constant time. A zipper is a simple data structure that allows constant-time moving from a node to its parent as well. This is achieved by maintaining a list of the subtrees not descended into, and then restoring from the latest entry on that list in order to recreate the parent node.
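
Here is the idea of a zipper on plain binary trees (a generic sketch, not the actual Idris data structure): a focused subtree plus a list of crumbs, each recording the direction taken and the sibling left behind.

(struct node (left right) #:transparent)
(struct crumb (side other) #:transparent) ; side is 'left or 'right

; each move is constant time; up reconstructs the parent from the crumb

(define (down-left t crumbs)
  (values (node-left t) (cons (crumb 'left (node-right t)) crumbs)))

(define (down-right t crumbs)
  (values (node-right t) (cons (crumb 'right (node-left t)) crumbs)))

(define (up t crumbs)
  (match (first crumbs)
    [(crumb 'left other) (values (node t other) (rest crumbs))]
    [(crumb 'right other) (values (node other t) (rest crumbs))]))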

We could also use the idea from Proust for propositional logic to have a global hash table or array of holes, with associated information. The challenge becomes how to effectively update that information. Brady used this approach in Idris 2 (2020-present), and Agda 2 does something similar. Both of these languages are implemented using Haskell, which is a pure lazy language, so the solutions look somewhat different from our Racket code.

Sometimes we don’t want a new tree, but a value that results from doing something in a hole (for example, if we want the context associated with a hole). Sometimes we don’t want a value at all (for example, if we just want to print the context for the user), in which case we can return a useless value. We could write another function set up to return an arbitrary value from a hole, but I didn’t want to write out another slightly-changed version of tactic-r, and you wouldn’t want to look at it.

Instead, I pulled out some heavy artillery from the Racket implementation, available in a library module imported at the start of the program. Continuations are a powerful and useful feature that let one capture a chunk of future computation as a lambda. Escape continuations are a particularly effective and simple use of this idea. They let one escape from a computation back to some predefined point. If we write Racket code that looks like this:

(call-with-escape-continuation
    (lambda (k)
       body-expr))

then Racket starts evaluating body-expr. But if k is applied to a value v during that evaluation, the result will be that the evaluation of body-expr is abandoned, and the value v is produced as the result of the application of call-with-escape-continuation.
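
For example, the following uses call/ec (the short name Racket provides for call-with-escape-continuation) to return the first element satisfying a predicate, abandoning the loop as soon as it is found:

(define (find-first pred lst)
  (call/ec
   (lambda (k)
     (for ([x (in-list lst)])
       (when (pred x) (k x)))
     #f)))

> (find-first even? '(1 3 4 5))
4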

So we don’t have to write a slightly-changed version of tactic-r. We can use this mechanism to escape from the recursion once we have reached a hole and computed the value we want to produce. The program uses tactic-r to define a number of useful utility functions.

We’re now ready to look at the tactics of the hole calculus. McBride provides many transformations, including some for rearranging or even dismantling partial constructions after a change of mind. Here I’ll stick to a small useful subset. We’re going to want to chain together two or more of these, with the possibility of failure partway through. In preparation for that, I will write internal tactics that consume a local proof state as a first argument, and produce a local proof state. These will be signalled with /i at the end of the function name. Some tactics will have user versions to be applied to the global proof state.

The first tactic is called claim, and it is a way of introducing a hole. In human-readable terms, given a type \(S\), it transforms \(?x:T~.~e\) into \(?y:S~.~?x:T~.~e\). McBride allows claim to work anywhere in an expression tree, but we don’t have a way for the user to indicate an arbitrary point, nor will they need to.

Here’s the code for claim/i. tactic-s consumes a local proof state and extracts information from it to provide to tactic-r. It also consumes a function doing the work in the hole specified by the local proof state, and that function consumes the context built up on the way to the current goal, as well as the goal’s Hole struct. Since we’re adding a new hole binder on top of that hole, we have to use rename to adjust the indices of the old hole.

; claim/i : PfSt Type -> PfSt
; claims t with ?nh : t . ?goal
 
(define (claim/i ps t)
  (tactic-s ps
    (lambda (ctx hole)
      (unless (Hole? hole) (tactic-error "current goal is not a hole"))
      (type-check ctx t (Type))
      (define nh (use-hole-ctr))
      (Hole nh t (rename add1 hole)))))

Here’s the user tactic for claim. Ultimately, we’re going to get rid of this, as we don’t want the user to have to learn OLEG. But it will be useful for now.

; claim : sexp -> void
; user tactic, claims se (creates new hole)
 
(define (claim se)
  (unless current-proof (tactic-error "no current proof in progress"))
  (define t (parse-in-hole se))
  (set! current-proof (claim/i current-proof t)))

The other internal and user tactic functions resemble these in many ways, so I won’t show code for most of them. Remember that the complete program is linked in the Handouts chapter and available for inspection.

The next tactic is called fill (McBride calls it try). It converts a hole \(?x:T~.~e\) into a guess \(?x\approx t:T~.~e\), provided that \(t\) typechecks as having type \(T\).
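
A sketch of fill/i in the same style as claim/i (hypothetical; it assumes the Hole and Guess structs suggested earlier, and the actual file may differ in details):

(define (fill/i ps t)
  (tactic-s ps
    (lambda (ctx hole)
      (unless (Hole? hole) (tactic-error "current goal is not a hole"))
      (match hole
        [(Hole n ht body)
           (type-check ctx t ht) ; t must have the hole's type
           (Guess n t ht body)]))))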

When can a guess \(?x\approx t:T~.~e\) turn into a certainty? When \(t\) is pure, that is, it contains no hole or guess bindings. McBride’s solve tactic replaces a pure guess with \(\mathrm{let}~ x = t ~ \mathrm{in}~ e\), but Brady does one step of reduction to obtain \(e[t/x]\). We’ll go with Brady’s approach, as it makes examples easier to read and is in line with our intention of hiding most of OLEG from the user, but we have added let, so we could do different things in different situations if we wanted.

The combination of fill and solve is a common one, and I will call this give, after the Agda Emacs command. In general, we might want to run several internal tactics in a row, and we might want to recover gracefully from failures instead of just throwing an error as we do now. Graceful recovery would allow more sophisticated tactics to try various things until one succeeds. It’s not hard to write the logic for this, but it gets tedious and unreadable for longer chains. Haskell provides the general mechanism of monads, which can be used for this and many other things. Monads work best in statically-typed languages; it’s hard to use them to full effect in Racket. Alexis King, a Racket contributor with much Haskell experience, has contributed a set of threading macros, again provided in a library module imported at the start of the program. Here’s the internal tactic give/i, which uses the threading macro ~>.

(define (give/i s e)
  (~> s
      (fill/i e)
      (solve/i)))

The threading macro consumes the initial state and then a sequence of applications, each of which has its first argument (the state from the earlier part of the chain) removed (because the macro will supply it in the actual code produced). I have not used the variant on the threading macro which allows for graceful recovery, but it is there in the library if we choose to use it.
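
For example, the give/i definition above expands into the nested application

(solve/i (fill/i s e))

with the state threaded through as the first argument at each step.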

The function make-def will let us convert a current proof which is pure into a global definition of the kind we already had earlier. We’re overdue for an example, but it wouldn’t be very interesting, because all we can do is make holes and guesses and complete them with full terms! We don’t have any way of creating partial terms yet. The next set of tactics addresses this.

We’ll start with introducing a lambda. The i-lam/i tactic, given a name \(n\) and a hole of the form \(?x:W~.~x\), where \(W\) reduces to a function type \(\forall y:S~.~T\), replaces the hole with \(\la n . ?x:T[n/y].x\). Of course, we are using de Bruijn indices, so \(n\) is just a name hint, and we don’t have to change variable names in \(T\). The i-lams tactic takes a list of name hints and repeatedly applies i-lam/i.

The hole has to have the special form that we initialize new holes to in new-def. If the hole had a nontrivial body expression, what would we do with it? There’s no reason that the scope of the new binder \(n\) should coincide with the scope of the hole \(?x\). Instead, we start with a new scope that is as small as possible, and then we can continue to refine it (notice that the body of the lambda is a hole of the required form). This requirement will hold for the other binding forms (for-all, let, and others we might introduce). I’ll let you read the code for those.

This requirement is not as onerous as it might sound at first, because we can arrange it if a hole has a nontrivial body expression. The tactic that McBride calls attack works on a hole of the form \(?x:T~.~e\), and replaces it with \(?x\approx (?y:T~.~y):T~.~e\). It is basically a specialized fill.

Finally, we have application of a function, as done by the Coq tactic apply or by typing the name of the function into a hole in Agda and issuing the refine command. Our first try will do somewhat less than these implementations do, and we’ll see why we need to do more.

If we have a hole \(?y:T.e\) (it does not have to be in the special form) and we wish to do the equivalent of typing \(f~?\) into the hole, then \(f\) had better have a function type with result type \(T\) (say \(\forall x:S.T\)). Then we will convert the hole into \(?x:S.\mathrm{let}~y=f~x~\mathrm{in}~e\) (or do one step of reduction on the let, as we did earlier, resulting in \(?x:S.e[f~x / y]\)).

This can be done with three of our internal tactics. First, we claim \(S\), transforming \(?y:T.e\) into \(?x:S.?y:T.e\). Then we fill with \(f~x\), resulting in \(?x:S.?y\approx f~x :T.e\). Finally, we solve, producing the desired result (a let, or its reduced version).
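
Concretely, the chain might look like this sketch (hypothetical; it assumes get-context retrieves the context at the focused hole, as it does in refine/i later, and that the focus stays on the original hole after claim/i):

(define (i-app/i s f)
  (match (reduce (type-synth (get-context s) f))
    [(Arrow _ tt tw)
       (~> s
           (claim/i tt)                              ; claim S
           (fill/i (App (rename add1 f) (Var 0)))    ; fill with f x
           (solve/i))]                               ; eliminate the guess
    [else (tactic-error "cannot apply")]))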

The function i-app/i in the program implements this idea directly. Now we’re ready for some examples! All the examples are available in the program. We’ll start by making the type of the polymorphic identity function. The easiest way is to simply type:

(def 'id-type '( (A : Type) -> (A -> A)))

But to illustrate interaction, let’s do this using our user tactics. We start by issuing the command (new-def 'Type). As written, this does the necessary initialization, but prints nothing. We can see what is set up by using (print-expr), which responds:

Expr: (?0 : Type => 0)

Type: Type

The command (print-goal) will respond with:

Goal: ?0 : Type

Context:

We will need to invoke these two often, and we could write a user tactic which does both, or have it done automatically by tactics that make changes.

We introduce the forall with (i-forall 'A 'Type). After that, (print-expr) shows us:

Expr: (∀ (A : Type) -> (?0 : Type => 0))

Type: Type

and (print-goal) shows us:

Goal: ?0 : Type

Context:

A : Type

Next we have an implication with left-hand type \(A\). We write this as a for-all with a don’t-care name, (i-forall '_ 'A). Once again, we could provide a user tactic to do this. (print-expr) and (print-goal) give us:

Expr: (∀ (A : Type) -> (A -> (?0 : Type => 0)))

Type: Type

 

Goal: ?0 : Type

Context:

_ : A

A : Type

Now we complete the expression with (give 'A). (print-expr) shows:

Expr: (∀ (A : Type) -> (A -> A))

Type: Type

The command (make-def 'id-type) will print:

id-type is:

Expr: (∀ (A : Type) -> (A -> A))

Type: Type

No current proof, use new-def to start one

We have basic interaction back, but this example didn’t really use information from the context to guide our hole-filling. The next example does a bit of that. Let’s build the polymorphic identity function itself.

Initializing with (new-def 'id-type) sets us up as expected.

Expr: (?0 : id-type => 0)

Type: id-type

 

Goal: ?0 : id-type

Context:

Looking at the meaning of id-type above, we see it has two arguments, so (i-lams '(A a)) makes sense. That results in the following situation.

Expr: (λ A => (λ a => (?0 : A => 0)))

Type: id-type

 

Goal: ?0 : A

Context:

a : A

A : Type

And now it is clear what to do: (give 'a).

Expr: (λ A => (λ a => a))

Type: id-type

(make-def 'id) sets up the global definition.

id is:

Expr: (λ A => (λ a => a))

Type: id-type

No current proof, use new-def to start one

We might want to see id-type expanded, and it wouldn’t be difficult to write user tactics that show a reduced goal type, or a context with all types reduced, as are available in Agda and Coq.

Next example. Here’s the statement of the transitivity of implication.

(def 'trans-type '( (A : Type) -> ( (B : Type) -> ( (C : Type) ->
                    ((B -> C) -> ((A -> B) -> (A -> C)))))))

We set things up with (new-def 'trans-type), and then we introduce the necessary lambdas with (i-lams '(A B C f g x)).

Expr: (λ A => (λ B => (λ C => (λ f => (λ g => (λ x => (?0 : C => 0)))))))

Type: trans-type

 

Goal: ?0 : C

Context:

x : A

g : (A -> B)

f : (B -> C)

C : Type

B : Type

A : Type

We need a C, and f produces one, so we should apply it, with (i-app 'f).

Expr: (λ A => (λ B => (λ C => (λ f => (λ g => (λ x => (?1 : B => (f 1))))))))

Type: trans-type

 

Goal: ?1 : B

Context:

x : A

g : (A -> B)

f : (B -> C)

C : Type

B : Type

A : Type

Now we need a B, and g produces one, so we (i-app 'g).

Expr: (λ A => (λ B => (λ C => (λ f => (λ g => (λ x => (?2 : A => (f (g 2)))))))))

Type: trans-type

 

Goal: ?2 : A

Context:

x : A

g : (A -> B)

f : (B -> C)

C : Type

B : Type

A : Type

Now (give 'x) completes our term.

Expr: (λ A => (λ B => (λ C => (λ f => (λ g => (λ x => (f (g x))))))))

Type: trans-type

And (make-def 'trans) installs the global definition.

trans is:

Expr: (λ A => (λ B => (λ C => (λ f => (λ g => (λ x => (f (g x))))))))

Type: trans-type

No current proof, use new-def to start one

With a tactic like i-app, we’re not restricted to just applying named functions from the context. We can apply an expression that has a function type. The next example illustrates this.

(def 'impex1-type '( (A : Type) -> ( (B : Type) -> ( (C : Type) ->
                     ((A -> B -> C) -> (A -> B) -> (A -> C))))))

We start with (new-def 'impex1-type) and then (i-lams '(A B C f g a)) to set up the lambdas.

Expr: (λ A => (λ B => (λ C => (λ f => (λ g => (λ a => (?0 : C => 0)))))))

Type: impex1-type

 

Goal: ?0 : C

Context:

a : A

g : (A -> B)

f : (A -> (B -> C))

C : Type

B : Type

A : Type

We don’t have a function in the context with result type C, but f applied to a does. (i-app '(f a)) gives us:

Expr: (λ A => (λ B => (λ C => (λ f => (λ g => (λ a => (?1 : B => ((f a) 1))))))))

Type: impex1-type

 

Goal: ?1 : B

Context:

a : A

g : (A -> B)

f : (A -> (B -> C))

C : Type

B : Type

A : Type

We can finish things off with (give '(g a)).

Expr: (λ A => (λ B => (λ C => (λ f => (λ g => (λ a => ((f a) (g a))))))))

Type: impex1-type

Things have been going well so far, though the examples are pretty simple. Proving the converse of the above theorem demonstrates some inadequacy in our current tools.

(def 'impex2-type '( (A : Type) -> ( (B : Type) -> ( (C : Type) ->
                     (((A -> B) -> (A -> C)) -> (A -> B -> C))))))

We start with (new-def 'impex2-type) and then introduce all the lambdas we can with (i-lams '(A B C f a b)).

Expr: (λ A => (λ B => (λ C => (λ f => (λ a => (λ b => (?0 : C => 0)))))))

Type: impex2-type

 

Goal: ?0 : C

Context:

b : B

a : A

f : ((A -> B) -> (A -> C))

C : Type

B : Type

A : Type

Now what? To get at the C, we need to apply f, but we don’t quite have what we need to apply it to. We can construct that, with (i-app '(f (λ x => b))).

Expr: (λ A => (λ B => (λ C => (λ f => (λ a => (λ b => (?1 : A => ((f (λ x => b)) 1))))))))

Type: impex2-type

 

Goal: ?1 : A

Context:

b : B

a : A

f : ((A -> B) -> (A -> C))

C : Type

B : Type

A : Type

And now (give 'a) will complete the term. But that application of f isn’t as incremental as we might like. The only way to do it more slowly is to use i-let to construct pieces, as we would with Proust for predicate logic without holes.

(i-let 'g '(A -> B) '(λ x => b))
(i-let 'h '(A -> C) '(f g))
(i-app 'h)

The issue is that we really want to consider f as a function of two arguments, and supply holes for both those arguments. But i-app only thinks of whatever we give it as a function of one argument. McBride calls this function "naive-reduce". (Brady skips this level to jump to a solution we’ll discuss in the last section of this chapter.) We’ll make it less naive by adding holes until we no longer have an arrow type.

(define (refine/i s f t dt)
  (cond
    [(equiv? t dt) (give/i s f)]
    [else
       (match t
         [(Arrow n d r)
            (~> s
                (claim/i d)
                (refine/i (App (rename add1 f) (Var 0)) r (rename add1 dt)))]
         [else (tactic-error "cannot refine")])]))

There is another subtlety regarding de Bruijn indices here. The desired type dt is going under the new binder introduced by claim, so it must be renamed, but r is being removed from one binder and placed under another, so its indices are correct.

We can now revisit the previous task and do a somewhat better job of interaction. As before, we start with (new-def 'impex2-type) and (i-lams '(A B C f a b)).

Expr: (λ A => (λ B => (λ C => (λ f => (λ a => (λ b => (?0 : C => 0)))))))

Type: impex2-type

 

Goal: ?0 : C

Context:

b : B

a : A

f : ((A -> B) -> (A -> C))

C : Type

B : Type

A : Type

But now (refine 'f) gets us:

Expr: (λ A => (λ B => (λ C => (λ f => (λ a => (λ b =>

         (?1 : (A -> B) => (?2 : A => ((f 1) 2)))))))))

Type: impex2-type

No goal is in focus, so we have to (focus 1). Hole 1 has a nontrivial body, so we must (attack). That gets us:

Expr: (λ A => (λ B => (λ C => (λ f => (λ a => (λ b =>

         (?1 : (A -> B) ≈ (?3 : (A -> B) => 3) => (?2 : A => ((f 1) 2)))))))))

Type: impex2-type

 

Goal: ?3 : (A -> B)

Context:

b : B

a : A

f : ((A -> B) -> (A -> C))

C : Type

B : Type

A : Type

Now we can construct the argument to f incrementally. (i-lams '_) gets us:

Expr: (λ A => (λ B => (λ C => (λ f => (λ a => (λ b =>

         (?1 : (A -> B) ≈ (λ _ => (?3 : B => 3)) => (?2 : A => ((f 1) 2)))))))))

Type: impex2-type

> (print-goal)

Goal: ?3 : B

Context:

_ : A

b : B

a : A

f : ((A -> B) -> (A -> C))

C : Type

B : Type

A : Type

We fill in hole 3 with (give 'b):

Expr: (λ A => (λ B => (λ C => (λ f => (λ a => (λ b =>

         (?1 : (A -> B) ≈ (λ _ => b) => (?2 : A => ((f 1) 2)))))))))

Type: impex2-type

Hole 1 is now pure, so (focus 1) and (solve) gets rid of it.

Expr: (λ A => (λ B => (λ C => (λ f => (λ a => (λ b => (?2 : A => ((f (λ _ => b)) 2))))))))

Type: impex2-type

Finally, we complete hole 2 with (focus 2) and (give 'a).

Expr: (λ A => (λ B => (λ C => (λ f => (λ a => (λ b => ((f (λ _ => b)) a)))))))

Type: impex2-type

This is an improvement, but it’s still far from a usable system. The need to refocus after a hole is filled in is annoying. More troublesome is the need for the user to know details about OLEG. They have to attack to create guesses, and eventually solve to get rid of them. In the next section, we’ll work on these issues.

5.3 Hiding the calculus

The code that results from the improvements in this section is linked on the Handouts page as proust-pred-forall-db-holes-tactics.rkt.

To reduce the need to refocus, we’ll replace the current goal in the proof state with a list of goals, current goal at the front. When the current goal is removed, focus shifts to the next most recent goal added, which is usually the correct choice.

There are a number of small obvious changes that have to be made. The slightly less obvious one is in the solve/i tactic. In addition to removing the now-solved current goal, it checks to see if the new current goal is now pure, and if so, solves that as well.

To avoid the need to attack, we build it into the user tactics which introduce binding forms, such as the one introducing lambdas, which is now renamed intros after the Coq tactic of the same name which does this.

(define (intros nhl)
  (unless current-proof (tactic-error "no current proof in progress"))
  (unless (or (symbol? nhl) (and (list? nhl) (not (empty? nhl))))
    (tactic-error "argument is a symbol or nonempty list of names"))
  (set! current-proof
        (if (symbol? nhl)
            (~> current-proof (attack/i) (i-lam/i nhl))
            (~> current-proof (attack/i) (i-lams/i nhl)))))

These improvements make the final example from the last section more natural to handle.

(def 'impex2-type '( (A : Type) -> ( (B : Type) -> ( (C : Type) ->
                     (((A -> B) -> (A -> C)) -> (A -> B -> C))))))

We start with (new-def 'impex2-type) and the renamed (intros '(A B C f a b)).

Expr: (?0 : impex2-type ≈ (λ A => (λ B => (λ C => (λ f => (λ a => (λ b =>

                             (?1 : C => 1))))))) => 0)

Type: impex2-type

 

Goal: ?1 : C

Context:

b : B

a : A

f : ((A -> B) -> (A -> C))

C : Type

B : Type

A : Type

(refine 'f) gives us:

Expr: (?0 : impex2-type ≈ (λ A => (λ B => (λ C => (λ f => (λ a => (λ b =>

                             (?2 : (A -> B) => (?3 : A => ((f 2) 3))))))))) => 0)

Type: impex2-type

 

Goal: ?3 : A

Context:

?2 : (A -> B)

b : B

a : A

f : ((A -> B) -> (A -> C))

C : Type

B : Type

A : Type

We finish off goal 3 with (refine 'a). This changes focus to goal 2.

Expr: (?0 : impex2-type ≈ (λ A => (λ B => (λ C => (λ f => (λ a => (λ b =>

                             (?2 : (A -> B) => ((f 2) a)))))))) => 0)

Type: impex2-type

 

Goal: ?2 : (A -> B)

Context:

b : B

a : A

f : ((A -> B) -> (A -> C))

C : Type

B : Type

A : Type

Next, (intros '_):

Expr: (?0 : impex2-type ≈ (λ A => (λ B => (λ C => (λ f => (λ a => (λ b =>

                             (?2 : (A -> B) ≈ (λ _ => (?4 : B => 4)) => ((f 2) a)))))))) => 0)

Type: impex2-type

 

Goal: ?4 : B

Context:

_ : A

b : B

a : A

f : ((A -> B) -> (A -> C))

C : Type

B : Type

A : Type

We finish with (refine 'b), which not only solves goal 4, but, since goal 2 is then pure, that as well. Proust prints the message "No goals remaining".

Expr: (λ A => (λ B => (λ C => (λ f => (λ a => (λ b => ((f (λ _ => b)) a)))))))

Type: impex2-type

That’s better. The user can still see guesses, but there is less need to look at the full expression, and you can probably think of ways of displaying the information that aren’t confusing to the user who doesn’t know OLEG.

The refine tactic still isn’t as nice as Agda’s. Here’s an example of something simple it can’t do. Suppose we want to construct (id Type id-type). This isn’t useful, but it stands in for more complex expressions that are useful. This has type Type, so we can start with (new-def 'Type). We can (give '(id Type id-type)), but what if we want to do this incrementally? When we try (refine 'id), it fails!

What went wrong? id has type (∀ (A : Type) -> (A -> A)). The refine tactic will construct (id ?0 ?1). Hole 0 will have type Type, and hole 1 will have type ?0, which is also the type of the expression (id ?0 ?1). But this expression needs to have type Type, the type of the goal. These are not the same, so refine will reject the suggestion.

But we know that the way to make the two types ?0 and Type the same is to fill hole 0 with Type. Once this is done, hole 1 will have type Type, and we can finish it off with (refine id-type). The task of computing a substitution for variables (metavariables in our case) which makes two types the same is called unification, and it is the subject of the next section.

5.4 Unification

The problem of first-order unification is defined on a set of equations over terms constructed from a set of uninterpreted function symbols \(f,g,\ldots\) (each with a specific number of arguments) and a set of variables \(x,y,\ldots\). A substitution is a mapping from variables to terms, and a unification problem is solvable if there exists a substitution such that applying it to the left- and right-hand sides of any equation makes the two sides structurally (syntactically) equal. As an example, the problem \(\{f~x=f~y, ~g~y=g~(f~z)\}\) has the solution \(\{x\mapsto f~z, ~y\mapsto f~z\}\). If we replace the one occurrence of \(z\) in an equation with \(x\), then there is no solution.

We can turn what we would more or less try in our heads into an algorithm. Remove an equation from the set. If it is of the form \(x=x\), discard it and continue with the rest of the set. If it is of the form \(x=t\), where \(x\) does not occur in \(t\), then substitute \(t\) for \(x\) in all other equations, and continue with the rest of the set. If it is of the form \(f~s_1\ldots s_m = g~t_1\ldots t_n\), but \(f \ne g\) or \(m \ne n\), then there is no solution. Otherwise add the equations \(s_i = t_i\) for \(i=1,2,\ldots,m\) to the set, and continue. When the set becomes empty, the problem is solved.
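
Tracing the algorithm on the example above: the equation \(f~x=f~y\) decomposes to \(x=y\); substituting \(y\) for \(x\) leaves \(\{g~y=g~(f~z)\}\), which decomposes to \(y=f~z\); substituting again empties the set, and composing the two substitutions yields the solution \(\{x\mapsto f~z, ~y\mapsto f~z\}\) given earlier.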

In practice, one might want to record the substitution along the way, for use afterwards. This problem is quite useful in computations involving logic and type theory, and a lot of work has been done on more efficient implementations. Even the simplest version of this algorithm derives the most general unifier, that is, any other solution can be derived from the substitution found with additional substitutions for some or all of the remaining variables.

In our context, we are trying to find a substitution that will make two types equal, in a particular context. So we start with a single equation. Our set of equations will consist of triples (context, left side, right side). The "uninterpreted function symbols" will be the constructors of our AST. The "variables" in a unification problem will be our metavariables, that is, references to holes. We won’t attempt to deal with references to guesses.

In the \(x=t\) case, we have to check that every variable referenced in \(t\) is in scope of the hole to which \(x\) refers. This is a stronger requirement than the "occurrence" check above. We’ll do the substitution right away (as in the basic description above), making use of our more general "parallel substitution" capability. Since we are moving among contexts, we have to take a bit of care to treat de Bruijn indices correctly. After substituting into a term, we reduce it.

Here’s the main unification function. You can find it and its helper functions in the complete program, linked as proust-pred-forall-db-holes-unif.rkt on the Handouts page.

(define (unify s f eqs)
  (match eqs
    ['() (list s f)] ; success!
    [(cons (list _ (Var n) (Var m)) reqs) #:when (equal? n m) (unify s f reqs)]
    [(or (cons (list ctx (Var n) t) reqs)
         (cons (list ctx t (Var n)) reqs)) #:when (in-scope-of-hole? s ctx n t)
      (define hn (first (list-ref ctx n))) ; hole number of (Var n)
      (~> s
          (focus/i hn)
          (give/i (rename (lambda (x) (- x n 1)) t)) ; "lift" t to level of hole hn
          (unify (psubst (make-vtmap n t) f) (subst-eqs hn t eqs)))] ; apply substs to f and eqs
    ; otherwise possibilities for t1, t2 for which heads match, recurse on pieces
    ; types are reduced, so no Def, Let, Ann; only Arrow, Lam, App, Type left
    [(cons (list ctx (Arrow x a b) (Arrow _ c d)) reqs)
         (define newctx (cons (list x a) ctx))
         (unify s f (cons (list ctx a c) (cons (list newctx b d) reqs)))]
    [(cons (list ctx (Lam x b) (Lam _ d)) reqs)
         (define newctx (cons (list x #f) ctx))
         (unify s f (cons (list newctx b d) reqs))]
    [(cons (list ctx (App g1 a) (App g2 b)) reqs)
         (unify s f (cons (list ctx g1 g2) (cons (list ctx a b) reqs)))]
    [(cons (list ctx (Type) (Type)) reqs) (unify s f reqs)]
    [failure #f]))

We use unify in the internal refine/i tactic, where it replaces the equivalence check that is done first. This has the effect of trying zero arguments, then one, then two, and so on, rather than immediately going to the full number of holes suggested by a function type. This is the behaviour of Agda’s refine command, and is useful if one wants to only partially apply a function or theorem.

; refine/i : PfSt Expr Type Type -> PfSt
; f is expression to be tried, t its current type, dt desired type
; attempts to fill with f, f ?, f ? ?, and so on
 
(define (refine/i s f t dt)
  (define ctx (get-context s))
  (cond ; unification success produces new state, new f
    [(unify s f (list (list ctx t dt))) => (lambda (p) (give/i (first p) (second p)))]
    [else ; unification failed
       (match t
         [(Arrow n d r)
            (~> s
                (claim/i d)
                (refine/i (App (rename add1 f) (Var 0)) r (rename add1 dt)))]
         [else (tactic-error "cannot refine")])]))

The problematic example of the last section is no longer an issue. We tried to construct (id Type id-type) starting with (refine 'id), and it didn’t work. Now, with unification in place, it does.

(new-def 'Type)
(refine 'id)
(refine 'id-type)

First-order unification was used in Idris 1. The language, like Agda, allows users to type holes during refinement or more generally into the program itself, which is checked on reloading. As a result, there isn’t just one pair of types to unify, but multiple pairs. It is then possible to fail to unify one pair, only to later get information that allows unification to succeed on that pair. The order of attempted unifications can make a difference. It’s not at all clear how to do this well.

Once one knows about unification, possible uses for it start popping up all over the place. Both Agda and Coq allow implicit arguments. In Agda, one can write id _ id-type and the compiler will infer, using unification, that the underscore should be Type. One can write the contract for id as {A : Set} -> A -> A (Set is what Agda calls Type) and then one doesn’t even need the underscore; one can then use it this way: id id-type. Agda and Coq also allow the user to omit type annotations, and attempt to infer the types of unannotated variables. Once again, unification is used for this. Why not unify every time we have to check whether two types are equivalent?

For some of these uses, it turns out to be more convenient to consider higher-order unification. This extends the problem space of first-order unification by allowing metavariables to appear in the function position of a term, whereas before only uninterpreted function symbols could appear there. It really opens things up when we consider solving for metavariables with lambdas. The number of arguments on the left and right sides of an equation need no longer be the same. For example, the solution for \(f~t = t\) could be \(\la x.x\). But it could also be \(\la x.t\) (for fresh \(x\)). We’re not going to be able to get that "most general unifier" property, and it’s not clear what the correct choice is.

There is an even bigger problem with higher-order unification: it is undecidable, as shown by Gérard Huet in 1973. As with other problems for which there is no algorithm (or no efficient algorithm), one can try heuristics which may fail, or one can use algorithms guaranteed to succeed in special cases.

One success story in the second category is Dale Miller’s "pattern fragment" (1991). This specifies conditions under which the solution to an equation like \(f~x~y = t\) (where \(f\) is a metavariable and \(x\) and \(y\) are user variables) is \(\la x~y~.~t\). This is a small extension to the first-order unification algorithm, but it works for the kind of typechecking issues that arise when dealing with pattern matching.

In the first category, heuristics which may fail, the first big success story was Huet’s 1975 algorithm, which became the basis for Coq, and influenced much subsequent work. Agda 2 uses an algorithm originally developed by Ulf Norell and Caterina Coquand in 2007, and improved upon by later authors. Jesper Cockx’s 2017 PhD thesis gives a good overview of the uses of unification in proof assistants, as well as contributing some state-of-the-art recent additions to Agda.

Our version of Proust with user tactics and unification is by no means complete. It’s more of a sketch than the versions from the chapters on propositional and predicate logic, which is why there aren’t any exercises in this chapter. But I hope it has given you some idea of the kinds of things that can be done to improve the usability of proof assistants, and provided an introduction to current research and development.