1 Basics
1.1 Getting started
You can download the Racket installer from the Racket website (https://racket-lang.org/). Racket programs can be written in your favourite editor and run from the command line or from a script; some of these editors have modes that permit interaction as well. However, in this document, I will assume that you are using the DrRacket application from the Racket distribution.
When you start DrRacket for the first time, it displays a window divided horizontally into two text areas, Definitions and Interactions. Racket is actually a family of languages, and you need to tell DrRacket which one to use. Use the Choose Language menu item in the Languages menu to bring up the language-choice dialog, and click the first radio button, labelled "The Racket Language". This will put the following line into the top of the Definitions window:
#lang racket
DrRacket will remember your choice. The above line should remain the first line of the Racket programs you write while working with this document. I will not include it in subsequent examples.
Programs written in the Definitions window can be saved and recalled using standard file-save and file-open dialogs. Either of the Definitions and Interactions windows may be temporarily hidden (in which case the other expands to fill the space). DrRacket permits multiple tabs per window and multiple windows.
The Help menu will bring up locally-installed documentation in your Web browser of choice, including the overview provided by The Racket Guide and The Racket Reference, which is more comprehensive. The same material also appears on the Racket website, and words typeset in blue in this document (including within code fragments) link to relevant sections in the Racket Reference there.
1.2 Programs, expressions, and values
In imperative programming languages, a program typically computes by repeatedly modifying or mutating the values of variables. In contrast, functional programming languages focus more on the application of functions (also called procedures) to already-computed values in order to create new values. Mutation is present in Racket but is de-emphasized; we will discuss it and other side effects much later in this document, to show what can be accomplished in their absence.
A Racket program is a sequence of definitions and expressions. When one runs a program, each of the expressions is evaluated in order, and the resulting values are printed in the Interactions window, one to a line. (All Racket values have a printed representation, though, as we will see, in some cases only a limited amount of information is printed.) The Interactions window then repeatedly offers an interactive prompt (>) which lets one enter more expressions and have them evaluated. This read-evaluate-print loop (REPL) is a common and useful feature of interpreters for functional programming languages.
The simplest expressions are just values. Here are some examples of Racket values.
42 -1267650600228229401496703205376 22/7 3.14159
Racket supports exact representation of unbounded integers and rational numbers (as well as complex numbers, which I will not discuss in this document). The approximation to pi in the fourth line above is an inexact number, equivalent to a double-precision floating point number. Inexact numbers, besides being produced when an exact rational representation is impossible, are useful when exact computations keep too many digits; there are functions to convert between exact and inexact representations. Many programming language introductions start with a "Hello, world!" program, but an equivalent Racket program is not particularly illustrative. It consists of a single expression made from a single string constant:
"Hello, world!"
An expression can also be the application of a pre-defined function (one provided by the Racket language) to a number of argument expressions. We construct such an application with an open parenthesis, followed by the name of the function, followed by the argument expressions separated by spaces, followed by a close parenthesis. Here’s an example, typed after the prompt (>) in the Interactions window, with the result shown:
> (max 3 -2 4.1 1/3) 4.1
This syntax is used for all function applications. Racket does not use infix notation for common arithmetic operations. + is a function that takes an arbitrary number of numerical arguments.
> (* (+ 1 2 3) (- 4 5 6) (/ 7 8 9)) -49/12
As a result, parentheses are not used to specify order of operations. Do not add extra parentheses; they are likely to change the meaning of an expression.
1.3 Definitions
One form of definition binds a name (or identifier) to a value (perhaps specified by an expression); the name can then be used in subsequent expressions. In the absence of mutation, this binding cannot be changed.
Another form of definition creates user-defined functions. The name of the function is put in parentheses with the names of its parameters, and this is followed by a single body expression representing the computation. We apply a user-defined function the same way we apply a pre-defined function.
(define x 3)
(define y (+ x 1))
(define (sum-of-squares x y) (+ (square x) (square y)))
(define (square x) (* x x)) > (sum-of-squares x y) 25
Multiple body expressions are permitted in function definitions, but all but the last one are evaluated and the values discarded. In the absence of side effects, multiple body expressions are not useful.
Many (not all) imperative languages are statically typed, meaning the type of any expression in the program can be inferred. This often involves adding explicit type annotations (to variables, parameters, and functions). In contrast, Racket is dynamically typed. Each value computed at run time has a type, but in general, two applications of the same function (even at the same point in a program) may involve arguments of different types and may produce values of different types.
We see the beginnings of the notion of scope in this example. In the body expression of the definition of sum-of-squares, the name x is used, and its binding occurrence is the first parameter x of sum-of-squares. The scope of that parameter is the entire body expression. In contrast, the name x is also used in the definition of y, but its binding occurrence is the top-level definition (define x 3). The scope of that occurrence is the entire program after the definition, except where x is given a new, more local meaning (as in the parameters of sum-of-squares and square). We will return to this idea shortly.
Definitions and expressions may be interspersed in a Racket program. When an expression is evaluated, it may use any or all definitions appearing above it in the program. The body expression of a function f may use functions defined below, as long as those definitions precede the application of f in an expression to be evaluated.
1.4 Some notes on style
The uniform syntax of Racket has computational advantages which we shall soon see. But it takes some adjustment. The beginnings of stylistic convention are already evident in our examples. Racket identifiers are case-sensitive, but capital letters are rarely used. Hyphens (-) are preferred over underscores (_) in identifiers. Arguments are either placed on the same line or start in the same column. Multiple close parentheses are typically placed on the same line.
DrRacket will position the cursor at the right level of indentation when the Enter key is pressed. This positioning can be overridden with Space and Delete keys, and restored for a single line using the Tab key. There are menu items to reindent the current text selection and the entire program. If the cursor is before an open parenthesis or after a close parenthesis, the entire sub-expression will be highlighted, which avoids the need to count parentheses. There are many useful keyboard shortcuts for editing which you can explore (and alter) by using the Keybindings entry in the Edit menu.
A semicolon (;) signals the start of a comment, which persists for the rest of the line. Nestable multi-line comments start with #| and end with |#. Racket programs tend to contain many small functions which don’t require in-line comments. A short header describing purpose and arguments can be useful.
1.5 Conditional expressions
Equality of numbers can be tested by the = function, which consumes two or more numeric arguments and produces #t if they are all equal, #f otherwise. The more readable name true is bound to the Boolean value #t, and false is bound to the Boolean value #f.
> (= 2 2) #t
> (= 2 3) #f
> (= (sqrt 4) 2) #t
> (= 2 2.0) #t
> (= 2 2 2) #t
> (= 2 2 3) #f
> (= "Pynchon" "DeLillo") =: contract violation
expected: number?
given: "Pynchon"
argument position: 1st
other arguments...:
"DeLillo"
The more general equality-testing function equal? can be applied to any two Racket values. (Functions that produce Boolean values often have names that, by convention, end with a question mark.)
The function not maps #t to #f and vice-versa.
> (> 3 4) #f
> (< 3 4) #t
> (>= 2 2 3) #f
> (even? 42) #t
> (not (even? 43)) #t
> (exact? (sqrt 4)) #t
> (exact? (sqrt 5)) #f
> (boolean? (= 3 5)) #t
> (number? (+ 3 5)) #t
> (string? "hawk") #t
> (string<=? "hawk" "handsaw") #f
An if expression contains three subexpressions. To evaluate an if expression, the first ("test") subexpression is evaluated. If it is false, the third subexpression is evaluated and becomes the value of the if expression; otherwise the second subexpression is evaluated.
> (define x -2) > (if (> x (sqrt 5)) "greater" "not greater") "not greater"
> (if (positive? x) x (- x)) 2
Note that the test subexpression need not produce a Boolean value, though typically it does. if is not a function, but a form, as not all of its arguments are evaluated. A Racket if expression differs in several ways from an if statement in an imperative language: an if expression has a value, both the second and third subexpressions must be present, and if expressions are typically not nested, because the more general and more readable cond expression can be used instead.
A cond expression contains a number of question-answer pairs of expressions, each enclosed in square brackets. The question expressions are evaluated in order until one does not evaluate to false, at which point the corresponding answer expression is evaluated and the resulting value becomes the value of the cond expression. The last question can be else, in which case the corresponding answer is evaluated.
(define (abs x) (cond [(> x 0) x] [else (- x)]))
> (abs 5) 5
> (abs -3) 3
The use of square brackets here is purely for readability. A matched pair of parentheses in a Racket program can be replaced by square brackets without changing the program, and vice-versa. We will see a few other places where square brackets are used by convention.
More complex tests can be constructed using and and or. An and expression evaluates each of its arguments in order until it finds one that has value false, in which case the whole expression evaluates to false. But if no such argument is found, the expression evaluates to the value of the last argument. or works the same way, but produces the first non-false value, or false if there isn’t one.
(define (f x) (if (positive? x) x #f))
> (and (> 3 4) (< 3 4)) #f
> (and (f 3) (f 4)) 4
> (and (f -2) (f 3)) #f
> (or (f -2) (f 3) (f 4)) 3
1.6 Recursion
Repetition, or doing a series of similar computations, is accomplished in Racket by using recursion. A function is recursive if it is applied within its own body expression. The following function computes n!, or the product of all integers from 1 to n (where n is positive).
(define (fact n) (cond [(= n 1) 1] [else (* n (fact (- n 1)))]))
> (fact 10) 3628800
> (fact 50) 30414093201713378043612608166064768844377641568960512000000000000
The evaluation of (fact n) for large n requires space (memory) roughly proportional to n, because the computation of (fact n) requires work to be done after the recursive application (fact (- n 1)) (namely the multiplication by n). We can avoid this space overhead by using tail recursion (intuitively, ensuring that the recursive application is the "last thing done").
(define (fact-helper n acc) (cond [(= n 1) acc] [else (fact-helper (- n 1) (* n acc))])) (define (fact n) (fact-helper n 1))
> (fact 10) 3628800
Racket will evaluate an application of this version of fact in constant space (apart from the space required for acc, which is accumulating the result), regardless of the value of the argument. However, due to the importance of recursion in Racket, the space allocated for managing control (the stack) is larger than in many imperative languages, so it is not necessary to write everything in a tail-recursive fashion.
1.7 Compound data
Two values can be combined into one using cons.
The printed form of a cons value changes if it is a list.
A list is either empty
or (cons v lst), where v
is any Racket value and lst is a list. Lists may contain values
of different types.
> empty '()
> (cons 3 empty) '(3)
> (cons true (cons "blue" empty)) '(#t "blue")
> (cons (cons 1 (cons 2 empty)) (cons 3 (cons 4 empty))) '((1 2) 3 4)
A cons value can be deconstructed with the unfortunately-named functions car and cdr (these names come from the earliest days of Lisp). If the value is a list, the more readable first and rest can be used instead.
> (car (cons 3 4)) 3
> (cdr (cons 3 4)) 4
> (first (cons true (cons "blue" empty))) #t
> (rest (cons (cons 1 (cons 2 empty)) (cons 3 (cons 4 empty)))) '(3 4)
The quote notation used to display the value of lists can also be used in programs, but only to specify a list whose values are known at the time the program is written. The more general list function makes a list out of all of its arguments. Racket also provides functions second through tenth to access the first several elements of a list, as well as a number of useful utility functions.
> (list (+ 1 2) (+ 3 4)) '(3 7)
> (third '(1 2 3 4 5)) 3
> (append '(1 2) '(3 4 5)) '(1 2 3 4 5)
> (list-ref '(6 7 8 9 10) 3) 9
> (length '(6 7 8 9 10)) 5
> (take '(1 2 3 4 5) 3) '(1 2 3)
> (drop '(1 2 3 4 5) 3) '(4 5)
> (reverse '(1 2 3 4 5)) '(5 4 3 2 1)
The expression '(1 2) is actually shorthand for (quote (1 2)). When the quote is applied to a single identifier, the result is a symbol, a form of atomic data with a constant-time equality test. Examples of symbols include 'nabokov and 'lowry. Symbols are useful when human-readable distinct values are needed, for example as tags. To a first approximation, the result of quoting an expression that starts with an open parenthesis is that all subexpressions until the corresponding close parenthesis are quoted and the results put into a list. Numbers, strings, and values specified with a hash-sign (such as #t and #f) quote to themselves. This is an oversimplification, but will do for the purposes of learning. (For the full gory details, see The Reader in the Racket Reference.)
quote gives a concise way of constructing nested lists; using list is more verbose, but permits evaluation of expressions. quasiquote (abbreviated ‘) provides both advantages. It acts like quote, but a subexpression using unquote (abbreviated ,) escapes to regular evaluation.
> `(1 2 ,(length '(1 2 3)) 4 5) '(1 2 3 4 5)
The recursive definition of a list, with its two cases, leads naturally to recursive code for processing lists.
(define (sum-list lst) (cond [(empty? lst) 0] [else (+ (first lst) (sum-list (rest lst)))]))
> (sum-list '(1 2 3 4 5)) 15
(define (square-list lst) (cond [(empty? lst) empty] [else (cons (sqr (first lst)) (square-list (rest lst)))]))
> (square-list '(1 2 3 4 5)) '(1 4 9 16 25)
These functions can also be written in a tail-recursive fashion.
(define (sum-list-helper lst acc) (cond [(empty? lst) acc] [else (sum-list-helper (rest lst) (+ acc (first lst)))])) (define (sum-list lst) (sum-list-helper lst 0))
> (sum-list '(1 2 3 4 5)) 15
Strings are a form of compound data, being composed of characters. There are a few string manipulation functions available, and more can be loaded from libraries (discussed in the next section).
> (substring "weather" 1 4) "eat"
> (string-append "wee" "knights") "weeknights"
But a common paradigm is to convert a string to a list of characters in order to process it, since there are many more useful functions available to manipulate lists.
> (string->list "turnip") '(#\t #\u #\r #\n #\i #\p)
> (list->string (reverse (string->list "pots"))) "stop"
Since a list may contain other lists, it is easy to use a list to represent a labelled tree. The first element of the list is the label of the root, and the rest of the elements are the representations of the child subtrees of the root. However, for clarity and readability, it may be better to use a structure with named fields. Here is an example for a binary tree.
(struct bt (label left right) #:transparent) (define tree (bt "root" (bt "left child" empty empty) (bt "right child" empty empty)))
> (bt? tree) #t
> (bt-label tree) "root"
> (bt-left tree) (bt "left child" '() '())
In general, the struct special form is given the name of a structure and names for zero or more fields (in parentheses). This causes several functions to be defined: a constructor (the name of the structure), a type predicate (the name with a question mark appended), and field accessors (the name of the structure and the name of the field joined with a hyphen). The optional keyword #:transparent ensures that the printed representation of a structure shows the field values; the default is that structures are opaque, which facilitates information hiding and data abstraction.
Each use of cons or a structure constructor uses a small fixed amount of memory. Racket has garbage collection; memory no longer used is periodically reclaimed. Lists in Racket, unlike in Lisp and Scheme, are immutable. Structures are immutable by default but can declared as mutable or partially mutable (as discussed later).
Immutable data structures lead to a form of programming where an update produces a new structure, but the old version is still available for use. The space overhead is often less than expected because the old and new versions usually share information. More importantly, the lack of mutation allows for clarity of code and facilitates the maintenance of invariants. In the absence of side effects, any expression can be replaced by another that produces the same value, which allows equational reasoning as typically used in mathematics.
1.8 Modules
A Racket program may be distributed over multiple source files. Each file starting with a #lang line is a module. A module can export bindings to other modules using the provide form, and can import bindings from other modules using the require form.
The provide form must appear at the top level (not inside another expression) and contains a sequence of identifiers bound at the top level (usually later in the module, since by convention provide and require expressions are placed right after the #lang line). There are variations to concisely provide all identifiers defined in a module, to rename a binding, and to export all bindings associated with a given struct definition.
The require form contains a sequence of require specifications. A require specification can be a string, which is interpreted as a path to a file relative to the location of the source file. It can also be an unquoted identifier, in which case it refers to an installed library (a collection). For example, (require "my-helpers.rkt") refers to a file in the same directory, and (require net/url) loads bindings from a library that has useful code for dealing with the Web. Each of these will import all bindings that are exported (using provide) from the specified module. There are variations to only import certain bindings, to rename or prefix imported bindings to avoid clashes with identifiers in the importing module, and to exclude certain bindings from the imported set.
The top-level expressions in a module are evaluated and the resulting values printed in the Interactions window when that module is run from the Definitions window. If the module is required by means of a require expression that is evaluated, the expressions in the required module are evaluated, but their values are not printed.
Racket supports submodules, with a couple of useful special cases. If a module has a main submodule, and the module is run, the main submodule is run afterwards. This does not happen when the module is required. If a module has a test submodule, that submodule is not run when the module is run, but it can be run from the command line.
Modules, and the forms described above, are more general than described here, but this brief introduction covers the majority of uses. Racket documentation for a library module usually starts with the require expression needed in your code. Racket’s package mechanism provides access to additional libraries that may not be installed by default. For details, see Package Management in Racket.
1.9 Testing
You can do on-the-fly testing by typing expressions into DrRacket’s Interactions window. If these expressions are put into the program in the Definitions window, they will be evaluated each time the program is run. But then you have the problem of matching computed values to expected values. If these are paired up in expressions, using an equality predicate, you can look for occurrences of false in the evaluated results to find mismatches. String values can be used to label each test.
"factorial test" (= (factorial 5) 120) "integer-to-string test" (string=? (integer-to-string "32") "thirty-two")
The teaching languages provide a testing framework with a nice graphical user interface. We can access this framework (though not the GUI) by requiring the module test-engine/racket-gui. We can then use the imported check-expect form to rewrite the above tests.
(check-expect (factorial 5) 120) (check-expect (integer-to-string "32") "thirty-two")
A check-expect expression is not evaluated immediately (which allows such expressions to be part of documentation that precedes a function definition). Evaluation of all such expressions is triggered by the following expression:
(test)
This produces a detailed report for all failed tests (including line number, expected value, and computed value) in a separate window. If, instead, we require the module test-engine/racket-tests, the report is produced in the Interactions window (or written to standard output, if the command-line version of Racket is used).
The testing modules also provide check-error, which checks if its single argument expression produces an error when evaluated; check-within, used to test whether an inexact computed value is within a certain tolerance of an expected value; and check-member-of, which permits several possible expected values.
The testing modules are really intended for beginners. Proper unit tests are provided by the module rackunit which, when combined with submodules as described above, provide effective testing for large developments.