3 Impure Functional Programming

3.1 Input and Output

The functions and forms covered in the previous section had as their sole purpose to produce a new value. But sometimes other work is done as well, typically referred to as a side effect. The simplest side effects to understand involve input and output (I/O).

We have seen that top-level expressions in a program are evaluated and their results printed in the Interactions window. A value can be printed during the computation, as a side effect, by using print in an expression (not necessarily at the top level).

> (print (* 3 4))
12
> (print "Chris Marker")
"Chris Marker"

Every expression, when evaluated, produces a value. The value of the two expressions above is #<void>. If this value is produced by a top-level expression, nothing is printed as a result. The value #<void> can also be produced by the function void, which consumes any number of values.

The begin form consumes several expressions, evaluates them in order, and produces the value of the last one. We have seen an example of this before: the body of a function definition can contain several expressions, which are treated in this fashion. A similar "implicit begin" occurs in many other Racket forms: for example, in the body of a lambda, in the body of let and other binding constructs, and in each answer of a cond.

An expression of the form (if t b (void)) can be written (when t b), and an expression of the form (if t (void) b) can be written (unless t b). The single expression b in these examples can be replaced by several expressions (another "implicit begin").
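As a minimal sketch of these forms working together (report-sign is a hypothetical helper, not something defined earlier in the text), the following uses when and unless with multi-expression bodies, and begin to sequence two calls before producing a final value.

```racket
#lang racket

;; report-sign is a hypothetical helper used only for illustration.
(define (report-sign n)
  ;; when: the body runs only if the test is true; otherwise the value is #<void>
  (when (negative? n)
    (display "negative: ")
    (display n)
    (newline))
  ;; unless: the body runs only if the test is false
  (unless (negative? n)
    (display "non-negative: ")
    (display n)
    (newline)))

;; begin evaluates its expressions in order and produces the value of the last one
(define result
  (begin
    (report-sign -5)
    (report-sign 7)
    'done))
```

Here result is bound to the symbol done; the two lines of output appear as side effects along the way.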

The functions write and display also print the value of their argument and produce #<void>, but with subtle differences from print. The write function is designed for data that will later be read in by a Racket program (using functions described below). The display function removes the hash-backslash prefix (#\) from individual character values, and the double quotes from around string values. The expression (newline) is equivalent to (display "\n"), that is, it prints a newline character. (This example also shows that Racket string constants may use a backslash to embed special characters. For more details, see Reading Strings in the Racket Reference.)
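The difference between write and display can be seen side by side in a short sketch. The function name demo is hypothetical; the comments describe what each call prints.

```racket
#lang racket

;; demo is a hypothetical name; it prints the same values with write and display.
(define (demo)
  (write "Chris Marker") (newline)    ; prints "Chris Marker", with the double quotes
  (display "Chris Marker") (newline)  ; prints Chris Marker, without quotes
  (write #\a) (newline)               ; prints #\a
  (display #\a) (newline))            ; prints a

(demo)
```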

A series of display expressions can be combined using begin to produce complex output, but it is often simpler to use formatted print, or printf. The first argument to printf is a format string, which may contain tilde escapes to indicate points at which the remaining arguments are embedded in the resulting output.

> (printf "The next two squares are ~a and ~a." (* 3 3) (* 4 4))
The next two squares are 9 and 16.

The format function is used in a manner similar to printf, but it does no output and instead produces as its result the string that would have been printed by printf. For more tilde escape options, see the documentation for fprintf in the Racket Reference.
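A small sketch of format follows. The names msg and quoted are illustrative only. The ~a escape embeds a value as display would show it, while ~s embeds it as write would.

```racket
#lang racket

;; format builds the string instead of printing it
(define msg (format "~a squared is ~a" 7 (* 7 7)))

;; ~a behaves like display, ~s behaves like write
(define quoted (format "display: ~a, write: ~s" "hi" "hi"))

msg     ; "7 squared is 49"
quoted  ; the second "hi" keeps its double quotes
```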

Many I/O functions take an optional last argument, which is a port. A port can be a destination for writing, or an origin for reading. The obvious use for a port is to write to or read from files, but ports can also be used to communicate over networks, between processes, or even to use strings as internal data structures. Here we will focus on files.

The function open-output-file consumes one argument, a specification of the path to the file, and produces a port. The simplest way to specify a path is by using a string (with the filename in the same directory as the program being run, or a relative specification if it is in a different directory). Racket has a more general way of specifying paths that allows one to write platform-independent code; see Paths in the Racket Reference. The keyword argument #:mode selects text or binary mode (binary is the default; text mode is useful in handling Windows line-ending conventions), and the keyword argument #:exists specifies what to do if the file already exists.

The function close-output-port consumes a port and closes it, flushing any pending writes.

The port produced by open-output-file must be supplied to a writing function as an argument in order for the file to be written to. This can be avoided through use of the parameter procedure current-output-port. The parameter can be queried by applying current-output-port to no arguments, and changed by applying it to a new port value. This is an example of the general parameter mechanism, which is discussed below.
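The following sketch shows both styles under some assumptions: it writes to a scratch file obtained from make-temporary-file, and it reads the result back with file->string, neither of which is discussed in this section. The first half passes the port explicitly; the second half changes the current output port by hand and then restores it.

```racket
#lang racket

;; A scratch file, so the example does not touch real data.
(define tmp (make-temporary-file))

;; Style 1: pass the port explicitly to the writing function.
(define out (open-output-file tmp #:exists 'truncate))
(write '(a 1 b 2) out)
(close-output-port out)              ; flushes pending writes

;; Style 2: make a file port the current output port.
(define saved (current-output-port)) ; query the parameter (no arguments)
(define out2 (open-output-file tmp #:exists 'append))
(current-output-port out2)           ; change the parameter
(display " more")                    ; no port argument needed now
(current-output-port saved)          ; restore the original port
(close-output-port out2)

(define contents (file->string tmp)) ; "(a 1 b 2) more"
(delete-file tmp)
```

Saving and restoring the port by hand is fragile if an error occurs in between, which is one motivation for the higher-level functions and the parameterize form.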

A common pattern of file interaction is implemented by call-with-output-file, which consumes a path specification and a procedure. It opens the file for writing, applies the procedure to the resulting port (the procedure uses this single argument for its writing), and then closes the file. The function with-output-to-file is similar, but its second argument is a procedure of no arguments: the port for the opened file is instead made the current output port by means of the parameter mechanism.
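A sketch contrasting the two functions is given below. It assumes a scratch file from make-temporary-file and reads the result back with file->string; both are standard Racket functions but are not covered in this section.

```racket
#lang racket

;; A scratch file, so the example does not touch real data.
(define tmp (make-temporary-file))

;; The procedure receives the port and passes it to each writing function.
(call-with-output-file tmp
  (lambda (out)
    (display "first line" out)
    (newline out))
  #:exists 'truncate)   ; the scratch file already exists, so say what to do

;; Here the procedure takes no arguments: the file's port is the current
;; output port while it runs, so display and newline need no port argument.
(with-output-to-file tmp
  (lambda ()
    (display "second line")
    (newline))
  #:exists 'append)

(define contents (file->string tmp))
(delete-file tmp)
```

In both cases the file is closed when the procedure returns, so no explicit close-output-port is needed.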

Racket also provides write-char to write a single character, and write-byte to write a single byte (write-char writes the UTF-8 encoding of Unicode, and so may write more than one byte).
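The byte counts can be checked with a short sketch. It relies on with-output-to-bytes, a standard function not discussed here, which collects whatever a procedure writes to the current output port as a byte string.

```racket
#lang racket

;; One character, written as its UTF-8 encoding.
(define lambda-bytes
  (with-output-to-bytes
    (lambda () (write-char #\u03BB))))  ; Greek small letter lambda

;; One byte, written as-is.
(define a-byte
  (with-output-to-bytes
    (lambda () (write-byte 65))))       ; 65 is the ASCII code for A

(bytes-length lambda-bytes)  ; 2: this character needs two bytes in UTF-8
(bytes-length a-byte)        ; 1
```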

Input is more difficult to fit into a mental model of computation, because, unlike output, it affects values produced by expressions. But from the point of view of syntax, it parallels output. open-input-file, close-input-port, current-input-port, call-with-input-file, and with-input-from-file operate in a manner analogous to their output counterparts.

read will read from the current input port if applied to no arguments, or from the port supplied as the argument. It will produce a value that, if written using write, will look like what was read. The following code (which deliberately uses display instead of write) will bind lst to '(a 1 b 2).

(with-output-to-file "myfile.txt"
  (lambda () (display "(a 1 b 2)")))
(define lst
  (with-input-from-file "myfile.txt"
    (lambda () (read))))

This provides a strong incentive for Racket programmers to store tree-structured data using nested lists, which avoids messy parsing issues. Further incentive is provided by the ease of specifying and manipulating nested lists using quote/quasiquote notation and pattern matching. Popular formats such as XML and JSON can be seen as more verbose forms of this representation.
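As a rough illustration of this point, the sketch below reads a nested list and takes it apart with a quasiquote pattern. The record layout (album, year) is invented for the example, and the input comes from a string port created with open-input-string rather than from a file.

```racket
#lang racket

;; The text to be read; in practice this would be the contents of a file.
(define in (open-input-string "(album \"Remain in Light\" (year 1980))"))

;; read produces the nested list directly; no hand-written parser is needed.
(define datum (read in))

;; A quasiquote pattern pulls out the parts of interest.
(define-values (title year)
  (match datum
    [`(album ,t (year ,y)) (values t y)]))
```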

For more conventionally-structured files, read-line will read a single line (an optional second argument can specify a line-termination convention). Lower-level input is provided by read-char, which will read a single UTF-8 character, and read-byte, which will read a single byte.
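The three reading functions can be mixed on the same port, as in this sketch. It reads from a string port (open-input-string, not covered above) so that it is self-contained; a file port would behave the same way.

```racket
#lang racket

(define in (open-input-string "first line\nsecond"))

(define line1 (read-line in))  ; "first line" (the newline is consumed but not included)
(define ch (read-char in))     ; #\s
(define b (read-byte in))      ; 101, the byte for the letter e
(define tail (read-line in))   ; "cond", the remainder of the last line
(define end (read-line in))    ; the end-of-file object, since nothing is left
```

The end-of-file value can be recognized with the predicate eof-object?.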

Racket has a number of additional I/O functions that handle more complicated situations, as detailed in the section Input and Output in the Racket Reference.

3.2 Mutation

A name-value binding can be changed or mutated by the set! form, which has as arguments a name (a variable in scope) and an expression that is evaluated to provide the new value to which the name should be bound. (Functions and forms that have side effects involving mutation often have names that, by convention, end with an exclamation mark.)

> (define x 3)
> (define (change-x y) (set! x y))
> x
3
> (change-x 4)
> x
4

The simplest mutable data structure provided by Racket is the box. The box function consumes a value and produces a box containing the value. Since a box is itself a value, it can be used as an argument, stored in a data structure, and so on. The unbox function consumes a box and produces the value inside. The set-box! function consumes a box and a value, and mutates the box so that it contains the new value. A box can be used in situations where, in an imperative language, call-by-reference or a pointer might be used. This example provides a simple counter mechanism.

> (define (increment-box b)
    (set-box! b (add1 (unbox b))))
> (define my-box (box 0))
> (unbox my-box)
0
> (increment-box my-box)
> (unbox my-box)
1
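The call-by-reference use of boxes can be sketched with a swap operation. The function swap-boxes! is a hypothetical helper written for this illustration.

```racket
#lang racket

;; swap-boxes! is a hypothetical helper: it exchanges the contents of two boxes.
;; The caller sees the change because both sides share the same boxes.
(define (swap-boxes! a b)
  (define old-a (unbox a))
  (set-box! a (unbox b))
  (set-box! b old-a))

(define left (box 'x))
(define right (box 'y))
(swap-boxes! left right)

(unbox left)   ; 'y
(unbox right)  ; 'x
```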

Lists constructed with cons are mutable in Lisp and Scheme, but not in Racket. A PLT blog post explains the reasons in depth. The section Mutable Pairs and Lists in the Racket Reference describes a distinct mutable list type.

User-defined structures can be made mutable by including the #:mutable keyword in the define-struct expression. This creates an additional function with a name like set-structname-fieldname!, which consumes a struct of the given type and a new value for the given field.

> (define-struct favourite (category entry) #:mutable #:transparent)
> (define my-faves (list (favourite "band" "Talking Heads")))
> (set-favourite-entry! (first my-faves) "Mission of Burma")
> my-faves
(list (favourite "band" "Mission of Burma"))

A box can be thought of as a mutable structure with a single unnamed field. Conversely, a box can be used to make a field of any structure type mutable, albeit with an extra layer of indirection.
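The second point can be sketched as follows. The account structure and the deposit! function are hypothetical; the structure itself is immutable, and only the box stored in its balance field changes.

```racket
#lang racket

;; A structure with no mutable fields (hypothetical example type).
(define-struct account (owner balance))

;; The balance field holds a box, which supplies the mutability.
(define acct (account "Ada" (box 100)))

;; deposit! is a hypothetical helper: it reaches through the field to the box.
(define (deposit! a amount)
  (define b (account-balance a))
  (set-box! b (+ (unbox b) amount)))

(deposit! acct 50)
(unbox (account-balance acct))  ; 150
```

The extra unbox on every read is the layer of indirection mentioned above.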

What are typically called arrays in other programming languages are called vectors in Lisp, Scheme, and Racket. A vector is a fixed-length sequence of values, indexed by integers from 0 (inclusive) to the length (exclusive), that supports access and update in constant time. A vector can be created by the vector function, which consumes an arbitrary number of arguments.

> (define vector-example1 (vector (* 0 0) (* 1 1) (* 2 2) (* 3 3)))
> vector-example1
'#(0 1 4 9)

Note the quote notation for vectors, which can be used in programs, though it creates immutable vectors. Quasiquote can also be used with vectors. Here is another way to build the above vector. (The function build-list serves a similar role for lists.)

> (define vector-example2 (build-vector 4 sqr))
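The different ways of writing the same vector can be compared in a short sketch (the variable names are illustrative). The predicate immutable? is used to show which vectors can be updated.

```racket
#lang racket

;; A quoted literal: the elements are fixed and the vector is immutable.
(define literal '#(0 1 4 9))

;; Quasiquote allows computed elements inside the vector notation.
(define computed `#(0 ,(* 1 1) ,(* 2 2) ,(* 3 3)))

;; build-vector and build-list apply a function to each index 0, 1, 2, 3.
(define built (build-vector 4 sqr))
(define built-list (build-list 4 sqr))

(immutable? literal)  ; #t
(immutable? built)    ; #f, so vector-set! may be used on it
```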

The vector-ref function consumes a vector and an index and produces the vector entry thus indexed. The vector-set! function consumes a vector, an index, and a value, and mutates the vector entry thus indexed to the new value.

> (vector-ref vector-example2 2)
4
> (vector-set! vector-example2 2 (* 4 4))
> vector-example2
'#(0 1 16 9)

The functions list->vector and vector->list convert between vectors and lists.
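One use of these conversions, sketched below with illustrative names, is to apply list functions such as map to the contents of a vector and then convert the result back. The vector produced by list->vector is a new, mutable one.

```racket
#lang racket

(define squares (build-vector 4 sqr))               ; contains 0 1 4 9
(define as-list (vector->list squares))             ; '(0 1 4 9)
(define shifted (list->vector (map add1 as-list)))  ; contains 1 2 5 10

;; The new vector is independent of the original and can be mutated.
(vector-set! shifted 0 100)
```

After the update, shifted contains 100 2 5 10, while squares is unchanged.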

A hash table implements a mapping from keys to values, such as in the "favourites" example above. Lookup and modification can be done in constant time. Hash tables come in mutable and immutable flavours (the immutable version is actually implemented using a purely-functional tree structure, and the "constant" is technically logarithmic in table size). Here we discuss the mutable version, but the immutable version has advantages similar to the other purely functional data structures discussed earlier.

> (define my-faves2 (make-hash))
> (hash-ref my-faves2 "band")
hash-ref: no value found for key
  key: "band"
> (hash-ref my-faves2 "band" "oops")
"oops"
> (hash-set! my-faves2 "band" "Sleater-Kinney")
> (hash-ref my-faves2 "band")
"Sleater-Kinney"
> my-faves2
'#hash(("band" . "Sleater-Kinney"))

The quote notation illustrated above can be used in programs to create immutable hash tables. The function make-hash can take an optional argument which is a list of pairs (each being the cons of a key and a value) with which to initialize the table. The function hash->list performs the reverse conversion. Other useful pre-defined functions that involve hash tables are described in the section Hash Tables in the Racket Reference.
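These conversions, and the contrast with an immutable table, can be sketched as follows. The variable names and the extra "film" entries are illustrative; hash-set (without the exclamation mark) is the functional update for immutable tables and is not discussed above.

```racket
#lang racket

;; A mutable table initialized from a list of key-value pairs.
(define faves
  (make-hash (list (cons "band" "Sleater-Kinney")
                   (cons "film" "Sans Soleil"))))
(hash-set! faves "band" "Mission of Burma")  ; mutates the table in place

;; Back to a list of pairs; the order of the pairs is not specified.
(define as-pairs (hash->list faves))

;; A quoted literal creates an immutable table.
(define fixed '#hash(("band" . "Talking Heads")))

;; hash-set produces a new table and leaves the original untouched.
(define extended (hash-set fixed "film" "Sans Soleil"))
```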

Functions such as current-output-port and current-input-port (described above), which control access and update to a stored value, are examples of parameter procedures. A parameter procedure can be created with make-parameter, which consumes a value and produces a parameter procedure that interacts with a parameter initialized to that value. The parameter value can be retrieved by applying the procedure to no arguments, or mutated by applying the parameter procedure to the new value. The parameterize special form resembles let, but its pairs consist of parameter procedures and values; those bindings are in effect for the duration of the evaluation of the body expression, providing a form of dynamic scope. Similar effects can be achieved with pairs of set! expressions, but parameters behave properly with non-standard control flow, such as might occur when threads or continuations (discussed later) are used.
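A minimal sketch of the parameter mechanism follows. The parameter indent and the function show-indent are hypothetical; the last part applies parameterize to current-output-port, using a string port (open-output-string and get-output-string) to capture the output.

```racket
#lang racket

;; indent is a hypothetical parameter with initial value 0.
(define indent (make-parameter 0))

;; show-indent does not mention any binding form; it just queries the parameter.
(define (show-indent) (indent))

;; Inside parameterize, the new value is seen even by called functions.
(define inside
  (parameterize ([indent 4])
    (show-indent)))               ; 4

(define after (indent))           ; 0: the old value is back

;; Applying the parameter procedure to a value mutates the parameter.
(indent 2)
(define mutated (indent))         ; 2

;; The same form redirects output for the duration of its body.
(define captured
  (let ([out (open-output-string)])
    (parameterize ([current-output-port out])
      (display "hello"))
    (get-output-string out)))     ; "hello"
```

Unlike the save-and-restore approach with explicit mutation, parameterize restores the previous value however control leaves its body.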
