7.6

List abbreviations and pattern matching

This supplementary material explains useful Racket features for Logic and Computation Intertwined.

The following Racket expression builds a list using cons.

(cons 2
  (cons 3
    (cons 4 empty)))

The resulting list value can be expressed more simply as (list 2 3 4).

An even more concise notation available is quote notation. Using it, we can express the above value as '(2 3 4). A simple way of thinking about quote notation is that an open parenthesis starts a list which is ended by the corresponding closed parenthesis. The characters in between are processed recursively in the same fashion. Numbers (and strings) quote to themselves. There is notation for expressing other Racket values which you can look up in the Racket documentation.

For a more complicated example, consider this nested list:

(cons 2
  (cons 3 (cons 1 empty))
    (cons 4 empty))

This can be expressed as (list 2 (list 3 1) 4) or as '(2 (3 1) 4).

If something that looks like an identifier or variable is used inside quote notation, it becomes a symbol. So another way of expressing '(* 3 4) is (list '* 3 4).

Another way of expressing '(* (+ 1 2) (+ 3 4)) is:

(list '*
      (list '+ 1 2)
      (list '+ 3 4))

Putting a quote in front of a Racket expression turns it into data that can be processed by a Racket function. As a modest example of how this might be useful, consider the following function, which interprets or evaluates quoted values that resemble Racket arithmetic expressions using +, *, and numbers.

(define (interp exp)
  (cond
    [(number? exp) exp]
    [(symbol=? (first exp) '*)
       (* (interp (second exp))
          (interp (third exp)))]
    [(symbol=? (first exp) '+)
       (+ (interp (second exp))
          (interp (third exp)))]))
 
(interp '(* (+ 1 2) (+ 3 4)))
=>* 21

In '(1 x 2), the mention of x becomes the symbol 'x. But what if there were a variable x whose value we wished to include in the list? We would need to say (list 1 x 2). Is there a way to combine the advantages of quote and list?

Quasiquote (usually abbreviated with an open single quote, as in `(1 2 3), works like quote, but it allows "escapes" to embed arbitrary Racket expressions. If such escapes are not used, it behaves like quote.

(define x 3)
 
`(a b x)
=> '(a b x)

Escapes within a quasiquote expression are handled by unquote (abbreviated with a comma):

(define x 3)
 
`(a b ,x)
=> '(a b 3)

Here we have used the Racket expression x, which in the given context will evaluate to 3. But instead of just escaping to an expression that is a single variable, we can use an arbitrary Racket expression.

(define x 3)
(define y '(c d))
 
`(a b ,(+ 2 x))
=>* '(a b 5)
 
`(a b ,y)
=> '(a b (c d))

At times we want to evaluate a Racket expression with a list value, but not just put the list value into the quasiquoted expression. Instead, we want to "splice" the computed list elements in. This is accomplished with unquote-splicing (abbreviated with a comma and at-sign):

(define x '(c d))
 
`(a b ,@x)
=> '(a b c d)

The pattern-matching special form match is available in full Racket, and it is convenient for many purposes.

The match special form resembles cond, but there is an initial expression which is reduced to a value, and the questions are replaced by patterns against which the value is matched. A pattern can bind variables for use in the corresponding answers. A use of match, then, specifies a value to be matched against and a series of pattern-answer pairs.

(match exp
  [pattern answer]
  ...)

Patterns have a recursive definition. An identifier by itself is a pattern that matches any value; that value is bound to that identifier. In the code below, the pattern y matches the value of x and is bound to that value.

(define (f x)
  (match x
    [y (+ 3 y)]))
 
(f 2) => 5
(f 5) => 8

Numbers and strings are patterns, matching their values.

(define (f x)
  (match x
    [3 "three"]
    ["four" 4]
    [y false]))
 
(f 3) => "three"
(f "four") => 4
(f 5) => false

We can form a compound pattern by applying list to a sequence of several patterns. In the example below, the pattern (list x y) matches a two-element list value, and binds x to the first element of the list and y to the second element of the list. Notice the local rebinding of x. In the answer (+ x y), x does not refer to the argument of f, but to the first element of that argument if it is a two-element list.

(define (f x)
  (match x
    [(list x y) (+ x y)]
    [(list x y rst ...)
       (list (- x y) rst)]
    [(list 1) false]))
 
(f '(1 2))   => 3
(f '(1 2 3 4)) => '(-1 (3 4))
(f '(1))     => false

The above example also shows how the ellipsis (...) can be used with an identifier to bind that to the rest of the list, allowing matching against lists of arbitrary length.

Identifiers can be repeated in a pattern, indicating a repetition of a value in whatever matches. In the example below, the list '(1 3 2) would not match any of the patterns, and a run-time error would result if g was applied to it.

(define (g x)
  (match x
    [(list x y x) (+ x y)]
    [(list (list x y) z) (- x z)]
    ['(1 (2 3)) true]))
 
(g '(1 3 1))   => 4
(g '((4 8) 3)) => 1
(g '(1 (2 3))) => true

In patterns, cons can be used as well as list to construct patterns matching list values. The underscore ("_") can be used as a pattern to indicate the presence of a value that need not be given a name. In a similar fashion, structures can be matched.

(struct posn (x y))
(define (h x)
  (match x
    [(cons 1 (list y z)) (+ y z)]
    [(list p _) (sqr p)]
    [(posn p q)
       (list p q)]))
 
(h '(1 2 3))        => 5
(h '(2 3))          => 4
(h (posn 7 8)) => (list 7 8)

We saw above that while quote (abbreviated as single close quote) can be used to construct list values, quasiquote (abbreviated as single open quote) also permits this, with the added advantage that within a quasiquoted expression, unquote (abbreviated with a comma) escapes to regular Racket expressions.

Quasiquote plays a similar role in patterns, except that here, unquote escapes to patterns.

(define (h x)
  (match x
    [`(1 ,y ,z) (+ y z)]
    [`(,(posn p q) ,r)
        (list p q r)]
    [`(,p ,_) (sqr p)]))
(h '(1 2 3)) => 5
(h (list (posn 7 8) 9))
  => '(7 8 9)
(h ('3 4)) => 9

We will find pattern matching very useful in writing readable code. As an example, consider the small interpreter for arithmetic expressions developed above.

(define (interp exp)
  (cond
    [(number? exp) exp]
    [(symbol=? (first exp) '*)
       (* (interp (second exp))
          (interp (third exp)))]
    [(symbol=? (first exp) '+)
       (+ (interp (second exp))
          (interp (third exp)))]))

Here is a version of it using match.

(define (interp exp)
  (match exp
    [(? number? n) n]
    [`(* ,x ,y) (* (interp x) (interp y))]
    [`(+ ,x ,y) (+ (interp x) (interp y))]))

The first pattern, using ?, applies the predicate number? to the value being matched, and if it produces true, matches the pattern n against it; the pattern n matches anything, and binds n to the result for use in the answer. The point here is that the version of interp using match is made more readable by pattern-matching using quasiquote patterns.

There are many other features of match, and some useful variations of it, described in the Racket documentation. Warning: the documentation is not written for beginners, and can be hard to understand.