This walkthrough illustrates the creation of a simple Scheme marking suite using BitterSuite, version 3. For your particular course, substitute course in directory paths with the appropriate value (e.g., cs135).
As of Spring 2010, the BitterSuiteInitialSetup should be done for any course using BitterSuite; this walkthrough assumes that setup by default (with pointers given where it differs).
The particulars of how marking suites are constructed are dictated by the behaviour of RST; more details on RST are at ISGScriptsManPages#RST. While BitterSuite is designed in a fashion that would allow it to run independently of RST, RST provides a number of safety features to help ensure basic protection is in place for your program and handles cycling through submissions automatically, so in practice BitterSuite is always run through RST.
All marking suites are contained inside of the directory /u/csXXX/marking. Let's say we want to create a marking suite for the assignment "fake". We would then create a subdirectory with this name: /u/csXXX/marking/fake/. Various testing suites for this assignment can then be created as subdirectories of this one. Let's create one called 0, so the full path to this suite will be /u/csXXX/marking/fake/0/.
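For example, this whole layout can be created in one step (substitute your course account for csXXX):

mkdir -p /u/csXXX/marking/fake/0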
If your .rstrc has been configured as described at BitterSuiteInitialSetup, this testing suite will automatically use the default BitterSuite hooks. Otherwise, you'll need to create explicit runTests and computeMarks scripts as described at that page. You also need to create a config.ss file, as described at BitterSuiteConfig.
You can now try running this testing suite to see results:
rst fake 0
Chances are this will give you an error message because this assignment does not exist (unless it has already been created). You'll want to create the directory /u/course/handin/fake and then populate that with directories representing a fake student or two.
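For example (the userids here are invented for testing purposes):

mkdir -p /u/course/handin/fake/teststudent1
mkdir -p /u/course/handin/fake/teststudent2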
You will also likely see RST complain about a missing answers directory. This check exists to make sure that if you're doing I/O tests you've prepared them properly. See BitterSuiteIOWalkthrough for details; otherwise, create an empty directory /u/csXXX/marking/fake/answers/ to suppress this warning message.
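That is:

mkdir /u/csXXX/marking/fake/answers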
Once you've done this, you will instead see a series of errors from BitterSuite because of other files that are missing. However, you'll be able to verify that rst is running runTests and computeMarks as expected.
Note that you may also see permissions errors on the marking directory if an advanced user configured RST not to overwrite marking directory permissions with friendly defaults; if this is the case, see RSTConfiguration for more information.
Jonathan Templin has created a Python script which automates the creation of this marking directory. The script has only been tested by CS 115 and CS 116 tutors, but it should be usable by any course. For more information and to download the script, see AutomatedAutotestCreation.
Naturally, at some point you will also want to test some student code. The testing hierarchy is contained within a directory named in inside of /u/course/marking/fake/0.
Inside of in we will create a directory for each test. For the purposes of creating a test, we will assume this assignment required students to submit a file addition.scm which contains a function add4 that consumes a single number and produces a new number which is 4 larger than the consumed number.
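A correct submission would therefore look like the following (this is the same sample solution used later in this walkthrough):

;; addition.scm
;; add4: consumes a number, produces that number plus 4
(define (add4 x) (+ 4 x))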
The testing suite needs to know a number of pieces of information, which are provided via an options.ss file. Each option is specified as a key-value s-expression. The ordering of these expressions matters; they are processed in order. In particular, there are keys that are understood by particular languages but not by the base code, and having the base code attempt to interpret these (for example, because a language-specific key appears before the language option) will cause the suite to abort.
So, we'll create a test directory called 1 (full path /u/course/marking/fake/0/in/1), and put the following information inside of a file called options.ss inside of that directory:

(language scheme/intermediate)
(loadcode "addition.scm")
(value 4)
(desc "Adding to zero")
(result (add4 0))
(expected 4)
The first line, (language scheme/intermediate), specifies the language that should be used to run the test. For Scheme, there are a number of different teaching language dialects; for this example, we're using Intermediate Student. Valid language values are those listed as teaching language codes in the sandbox.ss documentation in the DrScheme Help Desk, as well as scheme/module.
After this, we've specified which file contains the code to be tested with (loadcode "addition.scm"). This code will be loaded into a sandboxed testing environment where it is run in the specified language.
Next, (value 4) specifies the number of points this test is worth; any expression which evaluates to a numerical value will work. The default is 1, so this line is unnecessary if the test is only worth a single mark.
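For instance, (value (* 2 2)) would be an equivalent, if roundabout, way to make the test worth 4 marks.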
Finally, after all of that boilerplate, we'll specify the test itself. Information about a test generally consists of three Scheme expressions: desc, a brief description of the test; result, the expression to evaluate using the loaded code; and expected, the value that expression should produce. Here, the values of (add4 0) and 4 will be compared using the equal? function by default. If this returns true, the test will be marked as passed; if not, it will be marked as failed.
Now you can run the rst command specified earlier again. You should see much more meaningful results now. Try playing around with different fake student submissions in /u/course/handin/fake to see different output. For example, one student may write the code correctly: (define (add4 x) (+ 4 x)); another may specify it incorrectly: (define (add4 x) (+ 3 x)); another may misspell the function name: (define (ad4 x) (+ 4 x)); and a fourth student may submit something that isn't even valid Scheme: (define (hopeless.
Note that this is a required part of creating tests: before the suite is run on real submissions, you must verify that it behaves sensibly on correct, incorrect, and unrunnable code alike.
To create multiple tests, you can create new subdirectories of in and create new options.ss files in those subdirectories in a fashion similar to that given above; see the sketch below. Note that the names of these subdirectories are slightly more restricted than general directory names; you should stick to alphanumeric characters and avoid all symbols, including the hyphen and underscore, because they will conflict with the behaviour of the suite scripts.
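For instance, a suite with a second test might look like this (the directory name and test content here are illustrative):

in/
  1/options.ss    (the test above)
  2/options.ss    (the test below)

where in/2/options.ss repeats the boilerplate and relies on the default value of 1:

(language scheme/intermediate)
(loadcode "addition.scm")
(desc "Adding to a negative number")
(result (add4 -5))
(expected -1)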
This is enough for you to be able to write a testing suite for a given Scheme assignment. However, creating extra options files in each directory when they'll have very similar if not identical content is mind-numbing, given the repeated language and loadcode expressions you'll be typing. Worse, creating a new evaluator for every test will slow your rst runs to a crawl. There is a way to avoid this.
There are two key features which vastly improve the usability of this testing suite: options.ss files are inherited from parent directories as you descend through the directory hierarchy, and an evaluator is created once where loadcode appears and shared by the tests beneath it, rather than recreated for every test. This means the language and loadcode expressions could have been specified once in an options.ss file in the root in directory and omitted from all of the test subdirectories; the behaviour would have been the same. Normally the language option in particular is present at the top directory and in no subdirectories, since a given assignment is often entirely in a single language.
Moreover, these options can be overridden in deeper directories. For example, say that tests 1 through 5 are supposed to be worth 1 mark each, but test 6 should be worth 2. Then the first five directories do not need a value specified in their options files, since they will inherit the value of 1 from the parent directory, but the sixth test's directory can override this with its own value option containing 2.
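As a sketch (the sixth test's content is invented for illustration):

; in/options.ss -- inherited by every test
(language scheme/intermediate)
(loadcode "addition.scm")
(value 1)

; in/6/options.ss -- overrides the inherited value
(value 2)
(desc "Adding to a large number")
(result (add4 100))
(expected 104)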
Since the directory hierarchy can be arbitrarily deep, tests can also be grouped by value, although this grouping will be reflected in the autotesting output the students see. For example, a hierarchy could be constructed as follows:
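(an illustrative sketch; the grouping directory names here are made up)

in/
  options.ss              (language and loadcode for the whole suite)
  onemark/
    options.ss            (contains (value 1))
    1/options.ss
    2/options.ss
  twomarks/
    options.ss            (contains (value 2))
    3/options.ss
    4/options.ss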
See SchemeRationalNumbers for warnings about non-integer values.
Before any tests are released, you should take a look at BitterSuiteMarkSchemeWalkthrough to make sure you are taking proper advantage of the potential of marking schemes.
If you are going to be working with tests using input and output, see BitterSuiteIOWalkthrough.
You should also read the language-specific twiki pages listed below:
These pages are designed as reference pages; walkthroughs in the style of this page should be added for these languages at some point in the future.