This page gives a brief description of the history of the development of
BitterSuite.
The code for the latest revision of each version of
BitterSuite is available
in the directories /u/isg/bittersuite{,2,3} and in the ISG subversion repository.
The initial motivation for
BitterSuite was the state of the runTests and computeMarks scripts
that were written for various courses. They often did things that were slightly incorrect
(for example, occasionally generating 50-page printouts for students whose programs produced
output in an infinite loop), hacks in the code meant that variables were not self-documenting,
and features that were added were frequently lost. The code was developed gradually
during 2007.
In CS 134, there was a simple directory hierarchy set up; each marking suite had an
in directory, and each subdirectory of that contained a single test, consisting of the
files Test.java, .value and .desc. The goal for
BitterSuite was to generalize this:
the hierarchy could be nested arbitrarily deep, and instead of having a .value file containing
"1" in every test directory, a .value file could be specified at a particular directory
level and all subdirectories would inherit this. Over the development process, some features
were added on top of this (some of which were added and subsequently lost in previous
terms in various courses), including the ability to format marking schemes with nroff,
the ability to include autotesting marks directly on the marksheet with shell evaluation,
and the ability to divide tests by question.
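The inheritance rule described above can be sketched as a walk from a test directory up towards the root of the suite, taking the first .value file found along the way. This is a minimal illustration in Python, not the original code; the name `resolve_value` and the choice to return nothing when no .value file exists are assumptions.

```python
from pathlib import Path
from typing import Optional


def resolve_value(test_dir: Path, suite_root: Path) -> Optional[str]:
    """Return the contents of the nearest .value file at or above test_dir.

    Hypothetical helper: checks test_dir first, then each parent directory,
    and stops once suite_root has been checked.
    """
    current = test_dir.resolve()
    suite_root = suite_root.resolve()
    while True:
        candidate = current / ".value"
        if candidate.is_file():
            return candidate.read_text().strip()
        if current == suite_root or current == current.parent:
            return None  # no .value between the test and the suite root
        current = current.parent
```

With this rule, a single .value file placed in a question directory applies to every test beneath it, and a test can still override it with its own .value file.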
Over time the number of dot-file directives grew as the suite came to have better support
for both Scheme and Java. However, it had some major problems. One was that a single language
for testing had to be specified at the start of autotesting, as the language-specific entry
point would then make calls to the base code to tell it how to test that particular language.
However, mixed-language assignments started being designed in courses like CS 136, and this
could not be supported easily. Also, as the code base was layered gradually on top of the
CS 134 framework by tutors relatively unfamiliar with Scheme, it was not implemented in
the cleanest fashion.
BitterSuite2 was a complete rewrite of the BitterSuite1 code. It endeavoured to keep
much of the same functionality (unfortunately, working from an outdated version of the code and without knowledge of the documentation that had been written; such
situations are part of the motivation for the use of this Wiki). This rewritten code
allowed automatic switching of languages (detected by the extension of test files; Scheme,
C, Python, and shell were possibilities, but Java support was dropped). It also combined all of the various
dot-files into more convenient options.ss/options.scm files, which would contain key-value
pairs specifying any relevant information for the current level of the testing
hierarchy.
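As a rough illustration of extension-based language switching, the sketch below maps a test file's extension to a language name. The extensions in the table are assumptions chosen for illustration, and `detect_language` is a hypothetical name rather than part of BitterSuite2.

```python
from pathlib import Path
from typing import Optional

# Assumed extension table; the actual BitterSuite2 mapping is not given here.
EXTENSION_TO_LANGUAGE = {
    ".ss": "scheme",
    ".scm": "scheme",
    ".c": "c",
    ".py": "python",
    ".sh": "shell",
}


def detect_language(test_file: str) -> Optional[str]:
    """Return the language implied by a test file's extension, or None."""
    return EXTENSION_TO_LANGUAGE.get(Path(test_file).suffix.lower())
```

A file with an unlisted extension (for example a .java file, since Java support was dropped) yields no language in this sketch.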
The main issue with the second version became apparent in Winter 2009. In the process
of supporting language switching, the base code became heavily dependent on the details
of every supported language. In that term, there were three testing development issues
that needed addressing: Scheme testing needed to be moved from the PLT 3XX standard to
version 4, Python testing needed to be developed, and C testing needed to have some
improvements added. In order to do this, the entirety of the code had to be transferred
around and significant changes to the base code had to be merged.
The goals for the third version are outlined in the
BitterSuite3DesignProposal.
This version takes advantage of Scheme namespaces to allow the base code to understand
how to handle a handful of basic options, but leave most of the details to each individual
language (specified in a separate Scheme module). It also includes the course account in the language search path, meaning that
a tutor can create a language definition on the course account and test it without
needing to touch the centralized code. Helpful custom languages can be propagated
via the Wiki, and bug fixes to the supported languages (or additions of new languages
that may end up needing to be supported) can be merged after very thorough testing.
Another advantage of the language independence is that each language can provide
a number of very language-specific options without polluting the options shared
by all languages.
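The language search path idea can be sketched as a lookup over an ordered list of directories. This is only an illustration in Python: the function name, the one-subdirectory-per-language layout, and the choice to search the course account before the central code are all assumptions, not details taken from BitterSuite 3.

```python
from pathlib import Path
from typing import Optional, Sequence


def find_language_definition(
    language: str, search_path: Sequence[Path]
) -> Optional[Path]:
    """Return the first directory on search_path that defines `language`.

    Assumes (hypothetically) that each language lives in a subdirectory
    named after it; earlier entries on search_path take precedence.
    """
    for base in search_path:
        candidate = base / language
        if candidate.is_dir():
            return candidate
    return None
```

Under these assumptions, a tutor's language definition on the course account is found without any change to the centralized code, while languages the course does not define fall through to the central copy.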
At this point in time,
BitterSuite 3 is being prepared to become the default used in
CS 1([13][56]|45) in the Fall 2009 term.
Currently, tests are run on Solaris machines in a testing account. There are two concerns
with this:
- While the testing account provides a number of security restrictions, there are some interesting things clever students could do to potentially compromise some of the tests.
- The environment is likely to be radically different from the students' home environments.
A solution to this would be to wrap the testing portion in a virtual machine, using software that allows automated interaction with the virtual machine from our testing scripts. This machine
could be predesigned to have exactly the required set of software for testing and no more
(minimizing the number of things students would be able to do). It could then be booted,
have all required testing files copied to it, be given a program to run to perform all the
testing, have all of the testing files copied off of it to be processed back on the Solaris
(or, in the future, student.cs Linux) systems, and then be shut down without saving any changes
to the file system's state, so that each run is isolated from other students. This provides a higher
degree of security. As well, we could provide virtual machines to students with the same
environment (versions of programs, simulated architecture, etc.) so their home testing works
the same as the school system testing. This principle works in both directions: it also makes
it simpler for us to test with the same version of a program as the one we're instructing them
to use, which has unfortunately proved somewhat difficult on the Solaris systems.
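The per-run lifecycle described above (boot, copy files in, run the tests, copy results out, shut down without saving) can be sketched as follows. This is a hedged outline in Python, not a design for any particular virtual machine product: the `SandboxVM` interface, its method names, and the in-memory `FakeVM` are all hypothetical and exist only to show the ordering and the guaranteed discard of state.

```python
from typing import Dict, List, Protocol


class SandboxVM(Protocol):
    """Hypothetical interface; real VM software would need an adapter."""

    def boot(self) -> None: ...
    def copy_in(self, files: Dict[str, bytes]) -> None: ...
    def run(self, command: str) -> None: ...
    def copy_out(self) -> Dict[str, bytes]: ...
    def discard_and_shutdown(self) -> None: ...


def run_tests_in_vm(
    vm: SandboxVM, files: Dict[str, bytes], command: str
) -> Dict[str, bytes]:
    """Run one student's tests in a fresh machine and return the results.

    The shutdown is in a finally block, so the machine's state is
    discarded even if copying or running fails part-way through.
    """
    vm.boot()
    try:
        vm.copy_in(files)
        vm.run(command)
        return vm.copy_out()
    finally:
        vm.discard_and_shutdown()


class FakeVM:
    """In-memory stand-in used only to demonstrate the call order."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.files: Dict[str, bytes] = {}

    def boot(self) -> None:
        self.log.append("boot")

    def copy_in(self, files: Dict[str, bytes]) -> None:
        self.log.append("copy_in")
        self.files = dict(files)

    def run(self, command: str) -> None:
        self.log.append("run")
        self.files["results.txt"] = b"ran " + command.encode()

    def copy_out(self) -> Dict[str, bytes]:
        self.log.append("copy_out")
        return dict(self.files)

    def discard_and_shutdown(self) -> None:
        self.log.append("shutdown")
        self.files = {}  # nothing persists for the next student
```

Keeping the lifecycle behind a small interface like this would also leave open the question of where the hook lives, since the same function could be called from any testing script.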
One problem does not change: ensuring the set of software is correct. For Xhier systems, this
involves ensuring CSCF has installed an appropriate version in /software, and then explicitly
changing appropriate (hopefully isolated) portions of code to ensure that it is referring to the correct path.
For the virtual machine, somebody must still ensure that the single version
of each piece of software installed is the correct one, but it removes the path concerns, and
simplifies testing changes in advance of future terms as the code can simply make use of a newer
testing version of the virtual machine (instead of managing a list of ever-changing paths).
There are a number of virtual machines to consider, and they should be analyzed carefully to see which meet the requirements listed above. One possibility may be
VirtualBox.
See
ISGTechnicalDirection for more discussion.
*NB*: Should this be built into
RST's
runTests
hook (optional launch of virtual machine) so it's more generally available as an option, rather than building it into
BitterSuite?