Automated Testing Suite Creation and Execution
A script was created in Summer 2010 to assist tutors in the creation of autotesting folders for the assignments. An additional script was created in Fall 2010 to assist with the creation of test files within those folders. These scripts were created for CS116, but should be usable in any course that uses BitterSuite for testing. Currently, they are set up for use with Scheme and Python, but it should be possible to add other languages if desired.
The script for creating testing folders is called make-all.py, the script for creating the test files is called make-tests.py, and the script for creating the mark-scheme is called make-mark-scheme.py.
Step-by-Step Instructions (See below for more details if needed)
The following are step-by-step instructions for making public tests and autotests. More information is provided lower down on this page, or on the Python-specific page.
1. Test Folder Creation
This requires the use of the make-all.py script, which should be in the bin folder.
- In terminal, type make-all.py a## # (e.g. make-all.py a01 4 for assignment 1 with 4 questions)
- Type in the language (beginner, intermediate, intermediate-lambda, advanced, or python)
- Need check-within? Type y if yes, or n if not. Go through the prompts, answering y or n for each question
A folder called a## will be created in the marking folder. If you go into this folder, test.1 (autotest) and test.pt (public test) folders will have been created.
2. Public Test Creation
The public test folder will have an "in" folder, which will contain the specified number of question folders (one for each question). Do the following inside each of these folders:
- Go into a question folder. There should be an options.rkt file inside.
- Create a new folder called "t01" (The number after "t" corresponds to the test number. If more public tests are needed for this question, do the following steps for another folder called "t02", "t03" and so on).
- Create a Racket or Python file called test.rkt or test.py in this folder (depending on whether you are using Scheme or Python). In this file:
Scheme:
desc = test description
(result (fn-call par1 par2...))
(expected exp-val)
Python:
desc = test description
result = fn_call(par1, par2, ...)
expected = exp-val
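For example, a public test for a hypothetical Python function sum_list (the function name and values are illustrative, not from an actual assignment) might contain:
desc = sums a short list
result = sum_list([1, 2, 3])
expected = 6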
Ensure that these tests work by testing them on sample code:
- Go to handin->a## folder.
- Create a folder with your userid.
- Put different files for each question in the folder.
- In terminal, type rst -s userid a## pt anyname. Terminal will run the public tests on the code inside the userid folder and show the results. A folder with the full testing results should also now appear in the marking->a## folder. Note that anyname can be omitted, and a timestamp will take its place.
3. Private Autotest Creation
This will make use of the make-tests.py script, which should be in the bin folder.
- Go into marking->a##->test.1 folder.
- Create a text file for each question of the assignment.
- In the text files, create tests in the same format as the public tests, with exactly one blank line between tests and no blank lines at the end of the file (a sample file is shown after these steps).
- For each text file you have created, type make-tests.py a## question# file.txt. Each test will now be created in the "in" folder, inside the corresponding question folder.
Ensure that these tests work by testing them on sample code:
- Go to handin->a## folder.
- Create a folder with your userid.
- Put different files for each question in the folder.
- In terminal, type rst -s userid a## 1 anyname. Terminal will run the autotests on the code inside the userid folder and show the results. A folder with the full testing results should also now appear in the marking->a## folder. Note that anyname can be omitted, and a timestamp will take its place.
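Here is a minimal sketch of such a text file (assuming a Python question with a hypothetical function area); note the single blank line separating the tests:
desc = unit square
result = area(1, 1)
expected = 1

desc = general rectangle
result = area(2, 3)
expected = 6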
Different Types of Tests
If check-within is needed/used, ensure the fpequal.rkt module is in the provided folder in the test.pt and test.1 folders. If it is not there, it can be found in the "Useful Modules" folder in the marking folder:
desc = description
(modules fpequal.rkt)
(equal check-close)
(result (fn-call par1 par2...))
(expected exp-val)
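For example (a sketch only: the function circle-area and the expected value are illustrative, not from an actual assignment):
desc = area of a unit circle
(modules fpequal.rkt)
(equal check-close)
(result (circle-area 1))
(expected 3.141592653589793)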
If input is needed/used, ensure the suppress_prompt.py module is in the provided folder in the test.pt and test.1 folders. If it is not there, it can be found in the "Useful Modules" folder in the marking folder:
desc = description
input = {inp1
inp2
inp3...}
result = fn_call(par1, par2, ...)
expected = exp-val
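For example (a sketch assuming a hypothetical function read_and_add that reads two numbers from the keyboard and returns their sum):
desc = adds two numbers read from the keyboard
input = {3
4}
result = read_and_add()
expected = 7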
If printing is needed/used, ensure the redirect_output.py module is in the provided folder in the test.pt and test.1 folders. If it is not there, it can be found in the "Useful Modules" folder in the marking folder:
Note that the redirect_output.py module creates a list of strings from the printed output, where each string is a separate line of the output, so exp-val should be a list of strings.
desc = description
from redirect_output import *
result = redirect_output(fn_name, [par1, par2, ...])
expected = exp-val
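For example (a sketch assuming a hypothetical function print_greeting that prints a single line of output):
desc = prints a greeting
from redirect_output import *
result = redirect_output(print_greeting, ["Alice"])
expected = ["Hello, Alice!"]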
4. Running Autotests
These steps are based on the instructions in MarkUsPrepareForMarking#If_ISA_needs_to_upload_plain_tex. Do these steps any time after the due date. Replace a01 with the current assignment number.
- In MarkUs, go to the assignment's Submissions tab. Click the "Collect All Submissions" link and choose the option "Mark the files that were in the students' accounts at due date". Wait for all rows to turn green.
- Download the students' submissions to the course account by running this in the course account Terminal:
/u/isg/bin/markus.py download a01 /u/cs116/handin/a01_autotest
- Create a link between the a## and a##_autotest folders in the marking folder. In terminal, cd into the marking folder and run ln -s a01 a01_autotest.
- If you want to check that the students did enough testing, run the scripts to do that now. See TestTests for Python and CheckTestcases for Racket.
- Run RST to check the students' code:
distrst -t t -s '*' a01_autotest 1 AUTOTESTRESULTS
- When your testing scripts are all done running, update the variables in MakeGradedAssignment.py, then run python MakeGradedAssignment.py in terminal. This puts all of the results into one neat file which will be used to grade on MarkUs.
- Upload the GRADED_ASSIGNMENT.py file to MarkUs by running:
/u/isg/bin/markus.py upload_marking --filter 'GRADED_ASSIGNMENT.py' a01 /u/cs116/handin/a01_autotest/
- Check in MarkUs that the GRADED_ASSIGNMENT.py file is now visible when marking. If everything looks okay, create the marking scheme, annotations, and assign graders.
Running the autotests makes use of the CheckoutAndCommit.py script and the MakeGradedAssignment.py script, both of which should be in the bin folder. Remember to update the variables in these scripts before running them in the steps below:
- Create a new folder in handin called a##_autotest
- Download the groups file from MarkUs->a##->groups->download. Put the file into the marking->a## folder
- Update variables in CheckoutAndCommit.py script in the bin folder
- In terminal, run python CheckoutAndCommit.py checkout runcmds
- Create a link between the a## and a##_autotest folders in the marking folder. In terminal, cd into the marking folder and run ln -s a## a##_autotest. **If you are doing checktests (testtests), run them now**
- Make sure the autotests are working properly (run rst on a working solution)
- To run the autotests on everyone, in terminal run distrst -t t -s '*' a##_autotest 1 AUTOTESTRESULTS
- After the autotests and checktests are done running, update the variables in MakeGradedAssignment.py, then run python MakeGradedAssignment.py in terminal. This puts all of the results into one neat file which will be used to grade on MarkUs.
- Next, run python CheckoutAndCommit.py commit runcmds in terminal. This will allow MarkUs to recognize the changes.
- In the properties page of a## on MarkUs, change the deadline to just after the above script finished running. Then go to the Submissions tab and press "Collect All Submissions". *THIS IS THE ONLY TIME THIS BUTTON SHOULD BE PRESSED*. If it is pressed after marking has been done, all of the marking will be erased.
- Once all of the submission lines turn green, set the deadline back to the original deadline. All that is left to do is assign the graders, and then marking can begin!
make-all.py
Requirements for make-all.py
- Save make-all.py to your home directory and call the script from anywhere in Linux:
- make-all.py {asst name} {number of questions}
- For example, calling ~/make-all.py a4 4 will lead to the creation of a testing suite for an assignment called a4 that has 4 questions (or 4 files to be submitted)
- The first prompt will request a language. The options are
- beginner
- intermediate
- intermediate-lambda
- advanced
- python
- Next, if an alternate equality function is required (for checking inexact numbers or lists of inexact numbers), the user sets the alternate equality function for the required questions
Files and folders created by make-all.py
- Inside the handin folder, a folder is created with the name of the assignment. In that folder, .subfiles is created, with the required files to submit taking the form {ass't name}q{x}.{ss,py}. For example, in the above example of ~/make-all.py a4 4, if we assume the language had been set to Scheme, .subfiles will include:
- a4q1.ss
- a4q2.ss
- a4q3.ss
- a4q4.ss
- In the marking folder, a folder is created that holds the entire testing suite for the assignment. In the main folder, test.1 and test.pt are created for the autotesting and the public testing respectively. In those folders, the files mark-scheme (a template), runTests and computeMarks are created, as well as the folders in, provided and answers (if necessary). In the in folder, the main options.ss file is created, with the language specification. Also, a folder is created for each of the questions, each with its own options file specifying the loadcode as well as any alternate equality function. A sketch of the resulting layout is shown below.
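As an illustrative sketch of the layout (assuming the ~/make-all.py a4 4 example above with Scheme; the exact question folder names are an assumption based on the description):
marking/a4/
  test.1/
    mark-scheme
    runTests
    computeMarks
    in/
      options.ss
      1/
        options.ss
      2/
        ...
    provided/
    answers/
  test.pt/
    (same structure as test.1)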
make-tests.py
Using make-tests.py
This script will create all of the files and directories required to run tests on a question by reading a text file. To call make-tests from anywhere in the Linux terminal, first save make-tests.py into csXXX/bin/ and restart the terminal. You only need to do this once at the start of term. Each time you want to use the script, call the command make-tests.py {asst name} {question number} {text file name}. The text file must be saved in marking/???/test.1/, where ??? is the assignment name (such as a1 or a01), and should look something like this:
desc = y-intercept is a given point
(result (find-y-intercept (make-posn 0 1) (make-posn 1 0)))
(expected 1)
desc = y-intercept is origin
(result (find-y-intercept (make-posn 2 2) (make-posn -2 -2)))
(expected 0)
desc = general case
(result (find-y-intercept (make-posn 1 3) (make-posn 2 -5)))
(expected 11)
The script interprets blank lines in the text file as the end of a test; after the question tags, each test should be included in a block of text without blank lines, preceded by and followed by at least one blank line. The script will create test directories numbered sequentially beginning at t01.
The script checks each line of the text file for a specific set of tags (which are described in detail in the next section). The one tag you should use on every test is desc - this sets a description of the test, and gives students useful information on which tests they passed and which tests they failed. All tags have a default value, and so you can include as many or as few as needed for a given test; however, you should always include the desc tag, since its default value is just a number.
If a line does not begin with one of those tags, it will be written to the appropriate test.py or test.rkt file. Be sure to read the appropriate twiki pages to know what each test.py or test.rkt file should contain.
Question and Test Tags
Question tags are used to set values that will be the same for all tests of a given question, such as the language that will be used, or modules that need to be loaded. Test tags only apply to an individual test, and can do things like set keyboard input or change the mark value of an individual question. If a test and question tag disagree (such as over the mark value for a question), the test tag takes precedence.
Question and test tags can be further subdivided into two types: those which accept a single line of input, and those which can accept multiple lines of input.
- If a tag accepts a single line of input, the script will remove the tag from the beginning of the line, then strip the following characters from what remains: = { } ' " and any whitespace. The remaining text is treated as the argument for that tag.
- If a tag accepts multiple lines of input, the input must be contained within single or double braces (i.e. { and }, or {{ and }}). You cannot mix single and double braces; the option of double braces exists to allow single braces within the argument for the tag. Any characters, including newlines, between the braces are copied verbatim as the argument to the tag. Any characters outside of the braces, including the braces themselves, are not.
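For instance (the values here are illustrative), a multi-line input tag can be written with single braces, or with double braces when the argument itself needs to contain a brace:
input = {first line
second line}

input = {{a line containing { and }
another line}}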
Question Tags
The first lines of the text file can determine what is written in the options.rkt file for the question. While question tags usually don't need to be included (as in the example), there are situations where they are quite useful. If any of the following tags are included, they must be included in a single block (i.e. no blank lines between them) at the beginning of the file, and must be followed by at least one blank line before the first test-specific block. If none of these tags are included, an options.rkt file will still be written, and will contain the default loadcode and nothing else.
- q_value (single-line input) sets the default value for each test. If this tag is not included, each test will be worth one mark.
- q_language (single-line input) sets the language; valid options are the same as those for BitterSuite. If this tag is not included, any references to the language will be based on the language specified in the assignment's options file, which is created by make-all.py.
- q_loadcode (single-line input) sets the loadcode. If this tag is not included, the loadcode is set to AQ.E, where A is the assignment name, Q is the question number, and E is the extension associated with the language in use (i.e. rkt for Scheme and py for Python).
- q_desc (single-line input) sets the question description. This description will be overwritten by any test-specific descriptions.
- q_options (multi-line input) adds its argument to the end of options.rkt. This can be useful for adding modules and equality functions.
Following the question tags (if they are included) should be at least one blank line, and then the test information. A sketch of a file that uses question tags is shown below.
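As a minimal sketch (assuming a Python question; the function name largest and the values are illustrative, not from an actual assignment), the top of a text file using question tags might look like:
q_value = 2
q_language = python
q_desc = tests for the searching question

desc = general case
result = largest([4, 1, 7])
expected = 7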
Test Tags
Other than the specific tags listed below, any line included in a particular test block will be written to the test.py or test.rkt file (depending on the language chosen). It is usually best to include any tags at the beginning of the test, but they can be included anywhere within the test block.
- value (single-line input) sets the value for an individual test.
- desc (single-line input) sets the description for an individual test.
- options (multi-line input) adds its argument to the end of the test-specific options.rkt.
- input (multi-line input) creates a file containing its argument. This file is used as keyboard input, as described at BitterSuitePythonTesting.
- output (multi-line input) creates a file containing its argument. This file is used as screen output, as described at BitterSuitePythonTesting.
- file works slightly differently than the other tags. It takes a single-line input, which it uses as the name for the file it creates (the file is created in test.1/provided). The following line is treated as the beginning of a multi-line input, which is used as the contents of the file (this line should begin with an open brace or double brace).
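For example (the file name data.txt and its contents are illustrative), the following lines in a test block would create a file named data.txt in test.1/provided:
file = data.txt
{3 1 4
1 5 9}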
A sample text file can be downloaded from this page.
Using make-mark-scheme.py
Since you will often want to create the testing suite folder before the marks breakdown has been given, this script will create the proper mark-scheme file. The only drawback is that the script assumes 10 marks are assigned to autotesting for each question, and 10 marks to design recipe and style.
-- JonathanTemplin - 26 Aug 2010

Note that the attachments are saved as .py.txt files, not just .py. Just copy them to your home directory and correct the file extension.
Suggestions for improvement
- Add version numbers to the comment headers so course staff can see at a glance if they need to update their local versions of the scripts; see OnlineMarkUpload for an example.
- Don't hardcode any course names; have this be autodetected by the scripts so no modification needs to be done after these files are downloaded. The same goes for detecting the current term (TermCode).
- Allow the set of accepted languages to be exactly what BitterSuite can see as languages so the script doesn't refuse to allow acceptable languages; this likely means accepting values like "scheme/beginner" instead of "beginner" to avoid confusion.
- Verify that BitterSuiteInitialSetup is done according to expectations.
- Make use of Advanced Shell Interpretation in the generated marking scheme so there is no restriction to a certain number of marks per question.