This walkthrough goes through the creation of a simple marking suite to illustrate how a Scheme suite is built using
BitterSuite, version 3. For your particular course, substitute
course in directory paths with the appropriate value (e.g., cs135).
As of Spring 2010,
BitterSuiteInitialSetup should be done for any course using
BitterSuite; this walkthrough assumes that setup by default (with pointers otherwise).
Creation of a marking directory
The particulars of how marking suites are constructed are due to the behaviour of
RST. More details on the behaviour of
RST are at
ISGScriptsManPages#RST. While
BitterSuite is designed in a fashion that would allow it to run independently of
RST,
RST provides a number of safety features to help ensure basic protection is in place for your program and handles cycling through submissions automatically, so in practice BitterSuite is always run through
RST.
All marking suites are contained inside the directory
/u/course/marking
. Let's say we want to create a marking suite for the assignment “fake”. We would then create a subdirectory with this name:
/u/course/marking/fake/
. Various testing suites for this assignment can then be created as subdirectories of this. Let's create one called
0
, so the full path to this suite will be
/u/course/marking/fake/0/
.
If your
.rstrc
has been configured as described at
BitterSuiteInitialSetup, this testing suite should automatically use the default
BitterSuite hooks. Otherwise, you'll need to create explicit
runTests
and
computeMarks
scripts as described at that page. You also need to create a
config.ss
file, as described at
BitterSuiteConfig.
You can now try running this testing suite to see results:
rst fake 0
Chances are this will give you an error message because this assignment does not exist (unless it has already been created). You'll want to create the directory
/u/course/handin/fake
and then populate it with directories representing a fake student or two.
You will also likely see
RST complain about a missing
answers
directory. This check makes sure that, if you're doing I/O tests, you've prepared them properly. See
BitterSuiteIOWalkthrough for details; otherwise, create an empty directory
/u/course/marking/fake/answers/
to suppress this warning message.
Once you've done this, you will instead see a series of errors from
BitterSuite because of other files that are missing. However, you'll be able to verify that
rst
is running
runTests
and
computeMarks
as expected.
Note that you may also see permissions errors on the marking directory if an advanced user configured
RST not to overwrite marking directory permissions with friendly defaults; if this is the case, see RSTConfiguration for more information.
Jonathan Templin has created a Python script which automates the creation of this marking directory. The script has only been tested by CS 115 and CS 116 tutors, but it should be usable by any course. For more information and to download the script, see
AutomatedAutotestCreation.
A Single Scheme Test
Creating a test directory
Naturally, at some point you will also want to test some student code. The testing hierarchy is contained within a directory named
in
inside
/u/course/marking/fake/0
.
Inside of
in
we will create a directory for each test. For the purposes of creating a test, we will assume this assignment required students to submit a file
addition.scm
containing a function
add4 which consumes a single number and produces a new number which is 4 larger than the consumed number.
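For reference, a correct addition.scm needs nothing more than a definition along these lines (a minimal sketch; any design recipe elements your course requires are omitted):
;; add4: number -> number
;; produces a number 4 larger than the consumed number
(define (add4 x) (+ 4 x))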
Creating a test
The testing suite needs to know a number of pieces of information, which are provided via an
options.ss
file. Each option is specified as a key-value s-expression. The ordering of these expressions matters; they are processed in order. In particular, there are keys that are understood by particular languages but not by the base code, and having the base code attempt to interpret these will cause the suite to abort.
So, we'll create a test directory called
1
(full path
/u/course/marking/fake/0/in/1
), and put the following information inside a file called
options.ss
in that directory:
(language scheme/intermediate)
(loadcode "addition.scm")
(value 4)
(desc "Adding to zero")
(result (add4 0))
(expected 4)
The first line,
(language scheme/intermediate)
, specifies the language that should be used to run the test. For Scheme, there are a number of different teaching language dialects; for this example, we're using Intermediate Student. Valid language values are those listed as teaching language codes in the sandbox.ss documentation in the DrScheme Help Desk, as well as scheme/module.
After this, we've specified which file contains the code to be tested with
(loadcode "addition.scm")
. This code will be loaded into a sandbox testing environment where it is run in the specified language.
Next, (value 4) gives the number of points this test is worth; any expression which evaluates to a numerical value will work. The default is 1, so this line is not necessary if the test is only worth a single mark.
Finally, after all of that boilerplate, we'll specify the test itself. Information about tests generally consists of three Scheme expressions representing the following:
- Description: A string describing what the test should do.
- Student evaluation: An expression which will test the code submitted by the student.
- Expected result: The value expected from the evaluation of the above expression.
The results of evaluating
(add4 0)
and
4
will be compared using the
equal?
function by default. If this returns true, the test will be marked as passed; if not, it will be marked as failed.
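In other words, the check performed for this test amounts to something like the following (a conceptual sketch, not the actual BitterSuite internals):
;; with a correct submission, (define (add4 x) (+ 4 x))
(equal? (add4 0) 4)  ; evaluates to true, so the test passes
;; with a buggy submission, (define (add4 x) (+ 3 x))
(equal? (add4 0) 4)  ; evaluates to false, so the test fails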
Testing the tests
Now you can run the
rst
command specified earlier again. You should see much more meaningful results now. Try playing around with different fake student submissions in
/u/course/handin/fake
to see different output. For example:
- one student may write the code correctly: (define (add4 x) (+ 4 x))
- another may specify it incorrectly: (define (add4 x) (+ 3 x))
- another may misspell the function: (define (ad4) (+ 4 x))
- and a fourth student may submit something that isn't even valid Scheme: (define (hopeless
Note that this is a required part of creating tests. You must verify that:
- The test suite you have designed does not contain any errors.
- Code that is known to be correct passes your tests.
- Multiple examples of code that is known to be incorrect will fail the tests, and in particular, will fail in exactly the way that's expected and not in other ways.
The autotesting results will give you some indication of the behaviour of the suite under different circumstances:
- If the file cannot be found or the code cannot be loaded properly, there will be an appropriate failure report for the test, and execution of the test will not be attempted.
- If the test passes, the test description is printed along with an indication that it passed.
- If the test fails, the test description is printed followed by a mention of the failure, and both the result produced by the student code and the expected value are printed so the student can try to trace what went wrong.
- If running the test on the student code causes an exception to be raised, this is just a special case of the failure situation; the string representation of the exception is used as the student's output.
Creating additional tests
To create multiple tests, you can create new subdirectories of
in
and create new
options.ss
files in those subdirectories in a fashion similar to that given above, as in the sketch below. Note that the names of these subdirectories are slightly more restricted than general directory names; stick to alphanumeric characters and avoid all symbols, including the hyphen and underscore, because they will conflict with the behaviour of the suite scripts.
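For example, a second test directory /u/course/marking/fake/0/in/2 might get an options.ss like the following (a sketch reusing the add4 example; the description, argument, and expected value here are made up for illustration):
(language scheme/intermediate)
(loadcode "addition.scm")
(value 4)
(desc "Adding to a negative number")
(result (add4 -6))
(expected -2)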
This is enough for you to be able to write a testing suite for a given Scheme assignment. However, creating extra options files in each directory when they'll have very similar if not identical content is mind-numbing, as is retyping the repeated
language
and
loadcode
expressions. Worse, creating a new evaluator for every test will slow your rst runs to a crawl. There is a way to avoid this.
Hierarchical testing
There are two key features which vastly improve the usability of this testing suite:
- The values specified in options.ss files are inherited from parent directories as you descend through the directory hierarchy.
- Directories may be nested arbitrarily deep.
In the example above, the
language
and
loadcode
expressions could have been specified once in an
options.ss
file in the root
in
directory and not specified in any of the test subdirectories; the behaviour would have been the same. Normally the
language option in particular is present in the top directory and in no subdirectories, since a given assignment is often entirely in a single language.
Moreover, these values can be overridden in deeper directories. For example, say that tests 1 through 5 are supposed to have value 1, but test 6 should be worth 2. Then the first five directories do not need a
value
specified in their options files, since they will inherit the value of 1 from the parent directory, but the sixth test's directory can override this with its own
value
option containing 2.
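Concretely, assuming the in directory described above, this override amounts to just two lines spread across the hierarchy (a sketch; the per-test desc, result, and expected options are omitted):
in/options.ss: (value 1)
in/6/options.ss: (value 2)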
Since the directory hierarchy can be arbitrarily deep, questions can also be grouped by value, although this grouping will be reflected in the autotesting output for the students. For example, a hierarchy could be constructed as follows:
- in
  - options.ss: (value 1) (language scheme/beginner) (loadcode "abc.ss")
  - simple
    - 1
    - 2
    - 3
  - complex
    - options.ss: (value 3/2)
    - 4
    - 5
    - 6
      - options.ss: (value 2)
      - test.ss
In this case, tests 1 through 3 are worth 1 mark each, tests 4 and 5 are worth 3/2 each, and test 6 is worth 2, for a total of 3 × 1 + 2 × 3/2 + 2 = 8 marks. See
SchemeRationalNumbers for warnings about non-integer values.
What to Read Next
Before any tests are released, you should take a look at
BitterSuiteMarkSchemeWalkthrough to make sure you are taking proper advantage of the potential of marking schemes.
If you are going to be working with tests using input and output, see
BitterSuiteIOWalkthrough.
You should also read the language-particular twiki pages listed below:
These pages are designed as reference pages; walkthroughs in the style of this page should be added for these languages at some point in the future.