On this page:
3.1 Syntax and informal semantics
3.2 Examples in several languages
3.3 Formal semantics
3.4 Proving imperative programs correct
8.1

3 SIMP: a simple imperative language

\(\newcommand{\la}{\lambda} \newcommand{\Ga}{\Gamma} \newcommand{\ra}{\rightarrow} \newcommand{\Ra}{\Rightarrow} \newcommand{\La}{\Leftarrow} \newcommand{\tl}{\triangleleft} \newcommand{\ir}[3]{\displaystyle\frac{#2}{#3}~{\textstyle #1}}\)

This chapter discusses an artificial imperative language that has enough features to be able to illustrate concepts and difficulties that arise, while being simple enough to facilitate precise specification and implementation in the manner we are used to from Part I.

3.1 Syntax and informal semantics

What are the elements of imperative programming as we’ve seen so far?

We will use S-expression syntax to avoid parsing issues. This merely makes explicit the tree structure that is inherent in most programming languages. We would otherwise have to write a function that consumed a string representing a program and produce the required structure. A thorough treatment of techniques to accomplish this task would need a full book, course, or flânerie on its own (and it is not understood well by most computer scientists).

I’ll call the language SIMP, following the lead of Glynn Winskel in his book Formal Semantics of Programming Languages, in which he defined a similar language, with but with more conventional syntax. A SIMP program consists of declaration and initialization of some variables to integer values, followed by one or more statements. In the code below, the single statement uses print, whose single argument is a SIMP arithmetic expression or a string constant.

(vars [(x 0) (y 1) (z 2)]
  (print y))

Here are grammar rules for what we have so far.

  program = (vars [(id number) ...] stmt ...)
     
  stmt = (print aexp)
  | (print string)

We mutate variables with a set statement.

(set x (+ y z))

The sequencing construct is seq, which lets us combine several statements into one. There is an implicit seq in the vars statement.

(seq
  (set x 2)
  (set x (* x x))
  (set x (* x x)))

  stmt = (set id aexp)
  | (seq stmt ...)

The conditional construct is iif (for "imperative if", to emphasize that it does not produce a value). The test will be an expression that produces a Boolean value.

(iif (> x y)
     (set answer x)
     (set answer y))

  stmt = (iif bexp stmt stmt)

We can handle "one-sided if", in which one of the two sides does nothing, with the skip statement, which does exactly that. The repetition construct is while, which repeats a sequence of statements (again, there is an implicit seq) as long as a specified condition is true.

(while (> x 0)
  (set y (* 2 y))
  (set x (- x 1)))

  stmt = (skip)
  | (while bexp stmt ...)

We can add a modest collection of primitives for use in expressions (arithmetic and logical).

  aexp = (+ aexp aexp)
  | (* aexp aexp)
  | (- aexp aexp)
  | (div aexp aexp)
  | (mod aexp aexp)
  | number
  | id
     
  bexp = (= aexp aexp)
  | (> aexp aexp)
  | (< aexp aexp)
  | (>= aexp aexp)
  | (<= aexp aexp)
  | (not bexp)
  | (and bexp ...)
  | (or bexp ...)
  | true
  | false

We have several choices for implementation. The easiest one is to use Racket macros to implement the forms.

(define-syntax-rule
  (vars [(id init) ...] stmt ...)
    (let [(id init) ...]
      stmt ...))
 
(define-syntax-rule
  (iif test tstmt fstmt)
   (if test tstmt fstmt))

For while, we use a named let. This is a construct present in both Racket and Scheme. Adding a name after the let has the effect of binding the name to a function with as many arguments as there are binding clauses in the let. The body of the let becomes the body of this function, the function is initially applied to the values in the binding clause. This is a general description, but the use below is very simple, because there are no binding clauses, and the function thus has no arguments. We will see a more complicated use of named let shortly.

(define-syntax-rule
  (while test stmt ...)
    (let loop ()
      (when test
        stmt ... (loop))))

Exercise 7: Design and implement a construct for SIMP that resembles Racket’s for. Such a construct is often provided in imperative languages, though it can be simulated with while. \(\blacksquare\)

We can put all this in a file simp.rkt along with the following provide statement, which does the appropriate renaming of existing Racket features to implement the other SIMP forms.

(provide
  vars iif while
  (rename-out [display print][begin seq][set! set]
              [void skip][quotient div][modulo mod])
  > >= < <= = and or not true false + - *)

We can now create a file (say, power.rkt) containing a small SIMP program, and run it. (Sometimes the verb execute is used in place of "run" for imperative programs.)

#lang racket
 
(require "simp.rkt")
 
(vars [(x 10) (y 1)]
  (while (> x 0)
     (set y (* 2 y))
     (set x (- x 1)))
  (print y))

This lets us run SIMP programs. But it is too permissive; it also lets us program just in Racket, or mix Racket code and SIMP code. Racket has features that let us avoid this mixing, to keep us focussed on SIMP itself.

We change the provide expression in simp.rkt to export certain primitives.

(provide
  vars iif while
  (rename-out [display print][begin seq][set! set][void skip][modulo mod])
  > >= < <= = and or not  + - * /
  #%module-begin #%datum #%app #%top #%top-interaction)

The primitives handle, respectively, what is done with the text inside a module; what is done with Racket constants; what is done with function application; what is done with a top-level expressions; and what is done with a REPL expression. Here we are simply providing the built-in versions, though we could choose to provide alternates.

Then we can change power.rkt to look like this:

#lang s-exp "simp.rkt"
 
(vars [(x 10) (y 1)]
  (while (> x 0)
     (set y (* 2 y))
     (set x (- x 1)))
  (print y))

The first line says that S-expressions are used as the syntactic forms in this file (alternately, we could provide a reader capable of handling different syntax, such as in a typical imperative language without so many parentheses), and that the implementation of these forms is provided in simp.rkt.

We can now run this program, which can use only the bindings provided by simp.rkt (namely the core forms of SIMP).

This is enough to demonstrate our point, but if we were designing a proper language, we could go further. As things stand, our SIMP program could consist of several SIMP statements of various flavours, instead of just one top-level vars statement in a SIMP program. We can enforce this restriction by redefining #%module-begin, with the following changes:

(provide (rename-out [module-begin #%module-begin]))
 
(define-syntax module-begin
  (syntax-rules (vars)
    [(_ (vars defs stmt ...))
       (#%module-begin (vars defs stmt ...))]))

We could also add better error checking and reporting to SIMP by using the syntax-parse form instead of syntax-rules. For details, see the Racket documentation. Other things we might want to handle are enforcing that variables can be initialized only to integers, that Boolean operations or comparisons cannot be used in arithmetic expressions, and so on.

This is a small example of using Racket to design a domain-specific language (DSL) (for more examples, look for the paper by Matthew Flatt called "Creating Languages In Racket", or for the video of his presentation "Macros, Modules, and Languages in Racket"). Racket is particularly good at this, so if you have ambitions to design your own programming language, you should investigate further.

3.2 Examples in several languages

As a simple example of using the features of SIMP, consider the problem of computing all perfect numbers up to 10,000. A number is perfect if its divisors (including 1 but excluding itself) add up to it.

Here is a very naive but correct algorithm: Try each number i between 1 and 10,000. For each one, try all numbers j between 1 and i as divisors. Use an accumulator to hold the sum of all divisors seen so far. Print any number found to be perfect. This algorithm is naive because perfect numbers are quite rare; even ones must be of the form \(2^{p-1}(2^p-1)\), where \(2^p-1\) is prime, and if an odd one exists, it must be bigger than \(10^{1500}\). So the algorithm is doing far too much work testing numbers which cannot possibly be perfect. However, the code does illustrate some common patterns in imperative programming.

#lang s-exp "simp.rkt"
 
(vars [(i 1) (j 0) (acc 0)]
  (while (<= i 10000)
     (set j 1)
     (set acc 0)
     (while (< j i)
        (iif (= (mod i j) 0)
             (set acc (+ acc j))
             (skip))
        (set j (+ j 1)))
     (iif (= acc i)
          (seq
            (print i)
            (print "\n"))
          (skip))
     (set i (+ i 1))))

This program may look a little silly, but it is not far removed from a reasonable C program for the same algorithm.

-- C program

 

int main(void) {

    int i = 0;

    int j = 0;

    for (i=1; i <= 10000; i++) {

        int acc = 0;

        for (j=1; j < i; j++) {

            if (i % j == 0) {

                acc = acc + j;

            }

        }

        if (acc == i) {

            printf("%d is perfect\n", i);

        }

    }

}

We can translate this computation into several different versions in Racket. Here is a version using named let.

(let perfect ((i 1))
  (cond
    [(= i 10001) (void)]
    [(= i (let sum-div ((j 1) (acc 0))
            (cond
              [(= j i) acc]
              [(zero? (modulo i j))
                 (sum-div (add1 j) (+ acc j))]
              [else (sum-div (add1 j) acc)])))
       (printf "~a is perfect\n" i)
       (perfect (add1 i))]
    [else (perfect (add1 i))]))

Another general construct in standard Scheme is do. A do has a list of clauses setting up variables with initial bindings and update expressions, followed by a termination test, followed by a body expression. It resembles the more general for loop in C.

(do ((i 1 (add1 i)))
    ((> i 10000))
    (do ((j 1 (add1 j))
         (acc 0 (if (zero? (modulo i j))
                    (+ j acc)
                    acc)))
         ((>= j i)
          (when (= acc i)
            (printf "~a is perfect\n" i)))))

We have already discussed some of the Racket for extensions. for* nests the iteration of two or more sequences.

(for* ([i (in-range 1 10001)])
   (when (= i (for/fold ([sum 0])
                        ([j (in-range 1 i)])
                        (if (zero? (modulo i j))
                            (+ sum j)
                            sum)))
     (printf "~a is perfect\n" i)))

We can also do this in other languages. Haskell’s list comprehensions are similar enough to set notation that they are readable.

divisors :: Int -> [Int]

divisors i = [j | j<-[1..i-1], i `rem` j == 0]

main = print [i | i<-[1..10000], i == sum (divisors i)]

Python borrowed the idea of list comprehensions from Haskell, though the syntax is different.

print [i for i in range(1,10000) if i == sum([j for j in range(1,i) if i%j == 0])]

This is the shortest program we’ve seen so far, though we could have squeezed the Haskell program onto one line in a similar fashion. But how fast are each of these?

When I taught this material, I would run these from the command line on my MacBook Air with an Intel processor, using the time command in the shell. The Python program has the shortest code but the longest running time; the last time I ran it, it took 13 seconds. (The absolute numbers are not important; it’s their relative sizes that matter.) Python is an interpreted language, and there are no special optimizations for list comprehensions. They are "bolted" onto the language rather than being designed in from the start. It was possible to get this program down to about 7 seconds running time by using some other Python features, but Python was, at the time, not noted for speed.

Racket was also interpreted at the time (it now uses Chez Scheme as the back end, which is compiled to native code), but it used a JIT interpreter, which compiles small chunks of code to machine instructions on the fly. This made it faster. All the Racket examples above ran in a little under a second. (The SIMP program took about two seconds, since mutation is harder to optimize.)

Compiling the C program to native code, it ran in about a quarter of a second. But, surprisingly, so did the Haskell program. This is even more surprising because the Haskell compiler GHC (the Glasgow Haskell Compiler) used to compile Haskell programs to C. It now compiles directly to machine code, but when it used C, the same phenomenon was observed. Because Haskell is pure, GHC is capable of carrying out a considerable amount of optimization. In this case, it could detect that the lists were being processed sequentially, and did not create them in full. In effect, the C program produced by GHC from our Haskell program turns out to be very similar to our C program written by hand, even though the two programs look very different at first glance.

In the next chapter, we will consider even simpler imperative languages that expose even more of the machine, and you will get a sense of what native code might look like.

3.3 Formal semantics

Since we have used macros to rewrite SIMP programs as Racket programs, we can use the substitution model we have for Racket to define the meaning of a SIMP program. But just as we stripped away most of Racket in order to create SIMP, we should strip away the parts of the Racket substitution model that are are not relevant, and simplify what remains.

The main state elements of a SIMP program are the variables. (We will leave the print statement out of this discussion, as the handling of the output sequence is the same as for Racket, covered in the previous lecture module.) Rather than rewrite definitions, we will use mathematical notation. The state \(\sigma\) of a SIMP program will be a function or map from variables to their integer values. Since this function will change, we need a way to describe a modified version of a function. \([x\mapsto i]\sigma\) is the function that maps \(x\) to \(i\) and maps any other variable \(y\) to \(\sigma(y)\).

I will describe rewrite rules for pairs \((\pi,\sigma)\) of programs and state, with the understanding that we are always rewriting the leftmost, innermost redex. This understanding lets us put only the redex into the rewrite rule. For example:

((+ n1 n2) ,σ) => (n ,σ), where \(n=n_1+n_2\).

This is a rewrite rule that says that if the leftmost innermost redex is of the form (+ n1 n2), where the arguments are numbers, then the redex is replaced by the number which is their sum, and the state \(\sigma\) is unchanged. We are actually rewriting a pair \((\pi,\sigma)\) to \((\pi',\sigma)\), and (+ n1 n2) is the leftmost, innermost redex in \(\pi\).

There are a number of rewrite rules for arithmetic and boolean expressions, similar to the one in the previous paragraph, that do not change the state, and they will not be listed in detail here. The more interesting rules are the ones for statements. The rewrite rules for (skip) and (seq) (an empty sequence) also do not change the state, and simply erase the redex.

((set x n) ,σ) => (,[x↦n] σ)

((iif true s1 s2) ,σ) => (s1 ,σ)

((iif false s1 s2) ,σ) => (s2 ,σ)

((while t s1 ... si),σ)
=> ((iif t (seq s1 ... si (while t s1 ... si)) (skip)),σ)

Finally, we need a way of specifying the initial pair \((\pi_0,\sigma_0)\). If the SIMP program is (vars [(x1 n1) ... (xi ni)] s1 ... sj), then \(\pi_0\) is (seq s1 ... sj), and \(\sigma_0\) is the function such that for all \(1 \le k \le i\), \(\sigma(x_i)=n_i\).

One can easily write a stepper for SIMP (as described in Part I for Faux Racket) that implements the "small-step" substitution model given in the previous section. One can also write a more efficient interpreter which implements "big-step" semantics. The one-step function of the stepper would consume a (program, state) pair and produce a new one; the interpreter will consume a program and an initial state and produce the final state after the program has been run. As remarked on in Part I, it is desirable (and possible to prove) that the interpreter and the stepper give the same results, that is, that the small-step and big-step semantics coincide.

Exercise 8: Write a small-step interpreter for SIMP. \(\blacksquare\)

Exercise 9: Write a big-step interpreter for SIMP. \(\blacksquare\)

3.4 Proving imperative programs correct

This section should be considered optional reading. The point of covering it in detail is to illustrate how difficult reasoning about imperative code is, compared to reasoning about purely functional code. Since it is difficult, you should feel free to skim, or even skip to the next chapter.

My goal is to sketch a methodology for proving things about imperative code fragments. I’ll first review how proofs for functional code look, as briefly described in Part I.

Recall that the Fibonacci numbers are defined by the recurrence \(F_0=0\), \(F_1=1\), \(F_n = F_{n-1}+F_{n-2} ~(n>1)\). This translates easily into a Racket function.

(define (fib n)
  (cond
    [(zero? n) 0]
    [(= n 1) 1]
    [else (+ (fib (- n 1) (fib (- n 2))))]))

The proof that (fib n) equals \(F_n\) is an easy induction on \(n\), because the recursive code so directly mirrors the recursive definition. But this code is inefficient (it takes time exponential in \(n\)) and should never be used in practice. Instead, we use a helper function with two accumulators representing two consecutive Fibonacci numbers.

(define (fib-h n fn fnm1)
  (cond
    [(zero? n) fnm1]
    [(= n 1) fn]
    [else (fib-h (sub1 n)
                 (+ fn fnm1)
                 fn)]))
 
(define (fib n) (fib-h n 1 0))

This code has running time that is roughly proportional to \(n\) (assuming arithmetic operations take constant time, which is not really true, but that is not the point of this section). The correspondence between the code and the mathematical definition is less clear.

Proving (fib n) equals \(F_n\) seems to imply proving (fib-h n 1 0) equals \(F_n\). But this is not so straightforward, since if \(n>1\), (fib-h n 1 0) becomes (fib-h (sub1 n) 1 1). A trace of a typical computation suggests a more general hypothesis to prove.

(fib-h 5 1 0)
=> (fib-h 4 1 1)
=> (fib-h 3 2 1)
=> (fib-h 2 3 2)
=> (fib-h 1 5 3)
=> 5

We need to prove that for all \(j\), for all \(i>1\), if fjp1 is \(F_{j+1}\) and fj is \(F_{j}\), then (fib-h i fjp1 fj) equals \(F_{i+j}\). The proof of the stronger hypothesis is now a straightforward induction on \(i\).

To summarize the lessons learned in Part I, proofs of correctness for programs that use pure structural recursion are often straightforward using induction. But if the program uses structural recursion with an accumulator, coming up with the inductive hypothesis could take more work.

What would an efficient Fibonacci computation in SIMP look like? We don’t have functions in SIMP. But we can write a program that computes \(F_n\), given an initial value of n in a vars statement, using iteration in a manner similar to the tail-recursive fib-h function above.

Here is a first try at such a program.

(vars [(n 10) (fj 1) (fjm1 0) (ans 0)]
   (iif (= n 0)
        (set ans fjm1)
        (seq
         (while (> n 1)
            (set fjm1 fj)
            (set fj (+ fj fjm1))
            (set n (- n 1)))
         (set ans fj)))
   (print ans))

This program is unfortunately incorrect. The first set statement inside the while loop overwrites the value of fjm1 that is needed by the next statement.

Exercise 10: Without running the program either in Racket or in your head, form a hypothesis as to what this program actually computes, and then check your hypothesis. \(\blacksquare\)

We could try switching the two set statements around:

(vars [(n 10) (fj 1) (fjm1 0) (ans 0)]
   (iif (= n 0)
        (set ans fjm1)
        (seq
         (while (> n 1)
            (set fj (+ fj fjm1))
            (set fjm1 fj)
            (set n (- n 1)))
         (set ans fj)))
   (print ans))

But this is also incorrect.

Exercise 11: Repeat the previous exercise for this program. \(\blacksquare\)

We have two statements setting new values of two variables, but neither one can go first. The solution to the dilemma is to use a temporary variable t to hold an old value.

(vars [(n 10) (fj 1) (fjm1 0) (t 0) (ans 0)]
   (iif (= n 0)
        (set ans fjm1)
        (seq
         (while (> n 1)
            (set t fj)
            (set fj (+ fj fjm1))
            (set fjm1 t)
            (set n (- n 1)))
         (set ans fj)))
   (print ans))

We start to see some of the issues involved in imperative programming. In pure functional programming, where no values are mutated, this issue does not arise.

This program appears to do the right thing for various initial values of n. Our goal is to prove that this program prints the right Fibonacci number (\(F_n\)). Since we left printing out of our formal semantics, we will instead prove that the value of ans in the final state is \(F_n\). Whether a logical statement like \(\mathit{ans}=F_n\) is true or false depends on the state in which it is interpreted. We say such a statement is parameterized by the state.

The AST of a program is constructed out of smaller trees, and the big-step semantics (the interpreter) expresses the final state, given the initial state, in terms of recursive applications of the interpreter on program fragments. From the recursive interpreter, we can extract recursive rules for proving theorems involving parameterized logical statements. We use the idea of preconditions and postconditions from abstract data types. A precondition in this context is a logical statement interpreted in the initial state, and a postcondition is a logical statement interpreted in the final state.

In general, we are going to prove triples of the form:

{P}statement{Q}

where P and Q are parameterized logical statements. This should be read, "If precondition P is true in the initial state, and statement is executed, then postcondition Q will be true in the final state." More precisely, we are proving "For all states \(\sigma_i,\sigma_f\), \(P(\sigma_i)\) is true, and the executing of statement starting in state \(\sigma_i\) results in state \(\sigma_f\) , then \(Q(\sigma_f)\) is true."

Since the interpreter has a case for each kind of SIMP statement, we will examine each kind of SIMP statement in turn and see how the semantics dictate the triples that need to be proved. The plausible but incorrect Fibonacci programs above suggest that the rule for a while loop may be challenging to design and use.

As an example, to conclude

{P}(vars ([x1 v1] ... [xi vi]) stmt ...){Q}

it suffices to show

{P and x1=v1 and ... and xi=vi} (seq stmt ...) {Q}

Here the statement P might be saying something about relationships among the initial values v1 ... vi.

We will be reasoning informally about these rules but if we replace the triples in each one by the logical statements they represent (as defined precisely above), then it is not hard to prove each rule.

To conclude

{P}(seq stmt1 stmt2){Q}

it suffices to find a parameterized logical statement R for which we can prove

{P}stmt1{R}

and

{R}stmt2{Q}.

This easily generalizes to a seq containing more than two statements.

This requires finding an R that works for both parts of the proof, which may be difficult depending on the nature of the statements stmt1 and stmt2. In order to get things to match up nicely, we might need to strengthen a precondition and/or weaken a postcondition. This leads to the following rule.

To conclude

{P}stmt{Q}

we can prove

{P}stmt{Q}

and also P implies P, and Q implies Q. (More precisely, we need to show that for all states \(s\), \(P'(s)\) implies \(P(s)\), and \(Q(s)\) implies \(Q'(s)\).)

How can we conclude {P}(set x exp){Q}?

The postcondition Q of a set statement must be closely related to the precondition P, since it makes a small state change. But (set x exp) invalidates anything the precondition says about x, because it erases the previous value. Consequently, we cannot relate anything that P says about x to what Q says about it. However, we do know that the states before and after the execution of the set differ only in the value of x. So anything that Q says about other variables is true before the set is executed. Furthermore, if Q says something about x, then that must be true of exp before the set is executed. This leads to the following rule:

{Q[exp/x]}(set x exp){Q}

Here Q[exp/x] means Q with exp substituted for x.

This rule is a bit awkward to use because even though we think of the execution of a program as proceeding from top to bottom, the precondition is a modified version of the postcondition and not the other way around. This will have implications for our proof strategy, which we will see when we go back to our Fibonacci example.

How can we conclude {P}(iif B stmt1 stmt2){Q}?

B can be viewed as a parameterized logical statement, and from the code for the interpreter, we can read off that it suffices to prove the following two triples:

{P and B}stmt1{Q}

{P and not B}stmt2{Q}

The rule for (while B stmt ...) is the trickiest to work with. If the test B is false, the initial state and final states are the same. This suggests that our conclusion might be

{P}(while B stmt ...){P}

But we know that after the while loop terminates, B is false, and this will be true even if B is true in the initial state. So our conclusion is going to be

{P}(while B stmt ...){P and not B}

What do we need in order to conclude this? At the end of executing the body of the loop, we need P to be true, because that might be the last time the body is executed. If the body of the loop is executed, then we know that B is true. So what we need to prove in order to conclude the above is

{P and B}(seq s1 ... si){P}

The logical statement P is preserved by the body of the loop, and is often called a loop invariant. This seems paradoxical, because a loop is getting something done. One way to resolve the apparent paradox is to realize that the loop invariant expresses a relationship among the various state variables that holds even as the loop body changes their values. P need not hold partway through the sequence of statements that makes up the loop body, as long as it is restored by the end.

Of course, there are many statements preserved by the body of a loop. P could be true, or some statement about variables that are not even mentioned in the body of the loop and so do not change. But such statements are useless for our purposes. The point is to choose a loop invariant that gives us a useful postcondition P and not B, which lets us conclude that the loop has done the work that we designed it to do.

These are the rules of Hoare logic, defined by Tony Hoare (of Quicksort fame). They form the basis of formal methods for reasoning about imperative programs, even for languages more complicated than SIMP.

We are now ready to try to prove using the above rules, that the SIMP Fibonacci program works.

(vars [(n 10) (fj 1) (fjm1 0) (t 0) (ans 0)]
   (iif (= n 0)
        (set ans fjm1)
        (seq
         (while (> n 1)
            (set t fj)
            (set fj (+ fj fjm1))
            (set fjm1 t)
            (set n (- n 1)))
         (set ans fj)))
   (print ans))

We’ll prove a simpler program correct, eliminating the zero case and fixing a value of n.

(vars [(n 10) (fj 1) (fjm1 0) (t 0)]
   (while (not (= n 1))
      (set t fj)
      (set fj (+ fj fjm1))
      (set fjm1 t)
      (set n (- n 1))))
       ; fj = F(10)

The above postcondition fj = F(10) represents our goal. We can start with the fact that we have a vars statement, and so what we know initially are the initial values of the variables.

(vars [(n 10) (fj 1) (fjm1 0) (t 0)]
  ; n=10, fj=F(1), fjm1=F(0)
   (while (not (= n 1))
      (set t fj)
      (set fj (+ fj fjm1))
      (set fjm1 t)
      (set n (- n 1))))

In order to achieve our desired postcondition after the while, we need a loop invariant, and it has to be true when the loop commences. Coming up with this invariant is a black art, requiring understanding what the loop is trying to achieve and phrasing it in a form suitable for proof. The fragment below has the right invariant before the loop.

(vars [(n 10) (fj 1) (fjm1 0) (t 0)]
  ; n=10, fj=F(1), fjm1=F(0)
  ; fj=F(11-n),fjm1=F(10-n)
   (while (not (= n 1))
      (set t fj)
      (set fj (+ fj fjm1))
      (set fjm1 t)
      (set n (- n 1))))

If we can prove this loop invariant, we can conclude that it is true when the loop finishes, and the loop condition is not true (i.e., n = 1).

(vars [(n 10) (fj 1) (fjm1 0) (t 0)]
  ; n=10, fj=F(1), fjm1=F(0)
  ; fj=F(11-n),fjm1=F(10-n)
   (while (not (= n 1))
      (set t fj)
      (set fj (+ fj fjm1))
      (set fjm1 t)
      (set n (- n 1)))
 
  ; fj=F(11-n),fjm1=F(10-n), n=1
  ; fj = F10)

So this loop invariant lets us prove what we hope to prove about the whole program. Now our goal is to show that we have a valid loop invariant. That requires showing that the body of the loop preserves the invariant, as described below.

; fj=F(11-n),fjm1=F(10-n), not n=1
(seq
  (set t fj)
  (set fj (+ fj fjm1))
  (set fjm1 t)
  (set n (- n 1)))
; fj=F(11-n),fjm1=F(10-n)

Because the body of the seq consists of a series of set statements, and because the proof for each one involves a precondition that is a modification of the postcondition, the proof proceeds by pushing the postcondition for the seq back through the set statements, modifying it appropriately, and showing that the precondition for the seq is achieved.

We start with the final postcondition.

(seq
  (set t fj)
  (set fj (+ fj fjm1))
  (set fjm1 t)
  (set n (- n 1)))
; fj=F(11-n),fjm1=F(10-n)

Pushing the postcondition through the last set statement involves replacing n-1 with n everywhere in the postcondition.

(seq
  (set t fj)
  (set fj (+ fj fjm1))
  (set fjm1 t)
; fj=F(12-n), fjm1=F(11-n)      
  (set n (- n 1)))
; fj=F(11-n), fjm1=F(10-n)

We continue to push the logical statement back three more times, modifying it appropriately each time.

(seq
  (set t fj)
  (set fj (+ fj fjm1))
; fj=F(12-n), t=F(11-n)      
  (set fjm1 t)
; fj=F(12-n), fjm1=F(11-n)      
  (set n (- n 1)))
; fj=F(11-n), fjm1=F(10-n)

(seq
  (set t fj)
; fj+fjm1=F(12-n), t=F(11-n)      
  (set fj (+ fj fjm1))
; fj=F(12-n), t=F(11-n)      
  (set fjm1 t)
; fj=F(12-n), fjm1=F(11-n)      
  (set n (- n 1)))
; fj=F(11-n), fjm1=F(10-n)

(seq
; fj+fjm1=F(12-n), fj=F(11-n)
  (set t fj)
; fj+fjm1=F(12-n), t=F(11-n)      
  (set fj (+ fj fjm1))
; fj=F(12-n), t=F(11-n)      
  (set fjm1 t)
; fj=F(12-n), fjm1=F(11-n)      
  (set n (- n 1)))
; fj=F(11-n), fjm1=F(10-n)

What we have arrived at does not look quite like the required precondition, but we can strengthen a precondition.

; fj=F(11-n), fjm1=F(10-n), not n=1
; fj+fjm1=F(12-n), fj=F(11-n)
(seq
; fj+fjm1=F(12-n), fj=F(11-n)
  (set t fj)
; fj+fjm1=F(12-n), t=F(11-n)      
  (set fj (+ fj fjm1))
; fj=F(12-n), t=F(11-n)      
  (set fjm1 t)
; fj=F(12-n), fjm1=F(11-n)      
  (set n (- n 1)))
; fj=F(11-n), fjm1=F(10-n)

That concludes what we need to do for the proof, though a proper proof would be structured better than this. Here is the fully annotated SIMP program.

(vars [(n 10) (fj 1) (fjm1 0) (t 0)]
  ; n=10,fj=F1,fjm1=F0
  ; fj=F(11-n),fjm1=F(10-n)
   (while (not (= n 1))
     (seq
      ; fj+fjm1=F(12-n),fj=F(11-n)
      (set t fj)
      ; fj+fjm1=F(12-n),t=F(11-n)      
      (set fj (+ fj fjm1))
      ; fj=F(12-n),t=F(11-n)      
      (set fjm1 t)
      ; fj=F(12-n),fjm1=F(11-n)      
      (set n (- n 1)))
      ; fj=F(11-n),fjm1=F(10-n)))
 
  ; fj=F(11-n),fjm1=F(10-n), n=1
  ; fj = F10

Let’s step back from this and see what caused it to be so complicated. The heart of the problem is that mutation destroys information. The new value of a mutated variable is constructed from values not mutated by a given assignment statement, plus possibly the old value of the mutated variable. In our case, we could easily compute the old value, but in other situations, one has to introduce "ghost variables" that don’t appear in the program, but are used in the proof to save values lost to mutation.

The mutation of the loop variable is simple and predictable in our case, but we still have a situation where the loop invariant is being destroyed and then systematically recreated for the next (future) value of the loop variable, before that loop variable is updated at the end of the loop. I spent some years early in my career teaching introductory material to students, and when they came to me with incorrect loops, I found that they could not explain what they wanted their loop to do, or even understand why their incorrect loop gave the results that it did. This kind of reasoning is difficult when the rules are explicit, as they are above. It is much harder when the rules are not spelled out, and a student is left to rely on "intuition".

We have seen that proving a simple imperative program correct is rather complicated. It gets even worse when arrays are involved, and nearly impossible when list/tree data structures are used. The issue is aliasing (for example, different expressions computing the same array index, or different pointer paths to the same node). If an aliased element is mutated via one of its "names", then that mutation is visible through the other names as well, further complicating the reasoning.

The difficulty of formally proving things about imperative programs translates into difficulty with informal reasoning about imperative programs. This is another reason to start learning to program using the functional paradigm. In the next chapter, we will continue the process of exploring the imperative paradigm, by discussing how the language might change if the program is stored in a data structure that more closely resembles real memory (namely, a vector instead of a tree or nested list).