This script was written in Fall 2012 and first used extensively in Winter 2013. It automates the evaluation of student test cases so that grad TAs do not need to check by hand that students have sufficiently tested their code. It is designed only for Python code; a similar script exists for testing Racket code.

Before Using the Script

The script assumes that the directory /u/csXXX/marking/test_tests/ exists and is writable, and that the file tt_check.py (which can be downloaded from this page) is saved in the same directory as the test_tests script (this should already be done). It also reads the expected test cases from a separate Python file (usually saved to marking/test_tests/aXX.py). The format of this file is described in the section "Expected Test Cases".

Script Usage Summary

  1. Check that everything is set up as described in the section "Before Using the Script"
  2. Create a file as in the section "Expected Test Cases"
  3. Edit the variables in the test_tests file
  4. Run the test_tests script. Because the script file is saved in /u/csXXX/bin/, you can call test_tests from any location. You can specify additional options as described in the section "Additional Options"
  5. The script will ask you to confirm the options you have chosen; type y if they are correct and n if they are not.

Expected Test Cases

This file should contain one list of test cases for each question; these lists are then referenced in the main script. A sample list is below:

q1b = [Case("10 of spades", lambda c, ans: c.suit == "spades" and c.value == 10),
       Case("A spade other than the 10", lambda c, ans: c.suit == "spades" and c.value != 10),
       Case("A card other than a spade", lambda c, ans: c.suit != "spades")]

Note: More samples from previous terms can be found in /u/cs116/marking/test_tests/

Each case is an instance of the Case class (defined in the main script), which has three fields:

  • The first is a label; this will be included (when applicable) in the list of tests missing for each question
  • The second is a function which consumes the same parameters as the function being tested, followed (if the question uses keyboard input; see "User Input") by a list of keyboard input values in the same format as check.set_input, followed by the expected answer from check.expect/check.within. This function will consume the values from one of the student's check.expect/check.within tests, and should produce True if that particular test covers this case.
  • The third is optional. It is an integer specifying how many check.expect/check.within tests need to pass the function in order for the student to have sufficiently tested that case. If it is not included, it defaults to 1.
Note that the function does not need to be a lambda function, although a lambda is often the easiest way to write it. This file is evaluated as a normal Python file, so functions can be defined using def and then passed to Case by name.
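
As a rough illustration of the three fields, the sketch below uses a def-defined function and the optional count. The Case class here is only a hypothetical stand-in for the one defined in the test_tests script, and Card, q1c and missing_labels are made-up names; the real script's matching logic may differ.

```python
from collections import namedtuple

# Hypothetical stand-in for the Case class defined in the test_tests script;
# the real class may differ. Fields: label, function, required count.
class Case:
    def __init__(self, label, func, count=1):
        self.label = label
        self.func = func
        self.count = count

# Hypothetical card type with the fields used in the sample list.
Card = namedtuple("Card", ["value", "suit"])

def is_high_spade(c, ans):
    """True if the student's test uses a spade with a value above 10."""
    return c.suit == "spades" and c.value > 10

q1c = [Case("A spade above 10", is_high_spade, 2),  # needs two matching tests
       Case("A card other than a spade", lambda c, ans: c.suit != "spades")]

def missing_labels(cases, student_tests):
    """Return the labels of cases matched by fewer than `count` tests."""
    missing = []
    for case in cases:
        hits = sum(1 for args in student_tests if case.func(*args))
        if hits < case.count:
            missing.append(case.label)
    return missing

# Each tuple is (argument, expected answer) from one check.expect test.
student_tests = [(Card(12, "spades"), True), (Card(3, "hearts"), False)]
print(missing_labels(q1c, student_tests))  # ['A spade above 10']
```

Only one of the two student tests matches the first case, which asks for two, so its label is reported as missing.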

User Input

To deal with user input, add an additional parameter (lst) to the lambda function, placed after the function's own parameters and before the expected answer. This parameter is a list of all user input to be used in the test (i.e., it is equivalent to the list in the student's check.set_input call). For example, if the function consumes one parameter (astring) and takes in user input, the test case could be written as

Case("Test description", lambda astring, lst, ans: astring == "abcd" and len(lst[0].strip()) > 1 and ans > 0)

where astring is the parameter, lst is the list of user inputs, and ans is the expected result of the test given by the student in their check.expect.

As the above example shows, each user input string can be extracted by indexing the list (lst[i]). We could also use len(lst) to determine how many inputs the student's test case uses.
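
A minimal sketch of a case that checks the number of inputs with len(lst) follows. The Case class is again a hypothetical stand-in for the one in the test_tests script, and the values passed to the function are invented examples of what one student test might supply.

```python
# Hypothetical stand-in for the Case class defined in the test_tests script.
class Case:
    def __init__(self, label, func, count=1):
        self.label = label
        self.func = func
        self.count = count

def two_nonblank_inputs(astring, lst, ans):
    """True if the test passes "abcd" and supplies exactly two non-blank inputs."""
    return (astring == "abcd"
            and len(lst) == 2
            and all(s.strip() for s in lst))

case = Case("abcd with two non-blank inputs", two_nonblank_inputs)

# Invented values for one student test: the function argument, the list
# given to check.set_input, and the expected answer.
covered = case.func("abcd", ["yes", "no"], 3)
not_covered = case.func("abcd", ["yes"], 3)
print(covered, not_covered)  # True False
```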

Class Objects

If the assignment deals with any user-defined classes, the script requires the repr function for each class to be defined in a very specific way (probably different from the one in the assignment): it should produce a call to the class. For example, assume the question uses a card class whose objects are created by card(5, "spades"). For each such class, include an entry in the repr_dict dictionary (defined in the script). The key is the name of the class as a string (in this example, "card"), and the value is a string of the form "'classname(%s, %s,...)' % (repr(self.field1), repr(self.field2), ...)", with the fields listed in the same order as the constructor's parameters (in this case, "'card(%s, %s)' % (repr(self.value), repr(self.suit))").
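
The sketch below shows why the string must produce a call to the class: evaluating it yields text that can itself be evaluated to rebuild an equivalent object. How the script actually applies repr_dict is not shown on this page, so evaluating the entry with self bound to an object is an assumption, and the card class with fields value and suit is a hypothetical example.

```python
# Hypothetical class matching the card(5, "spades") example.
class card:
    def __init__(self, value, suit):
        self.value = value
        self.suit = suit

# Entry in the form described above, with fields in constructor order.
repr_dict = {"card": "'card(%s, %s)' % (repr(self.value), repr(self.suit))"}

c = card(5, "spades")

# Assumption: the entry is a Python expression evaluated with `self`
# bound to the object being printed.
text = eval(repr_dict["card"], {}, {"self": c})
print(text)  # card(5, 'spades')

# Because the text is a call to the class, it can be evaluated again
# to recreate an object with the same fields.
rebuilt = eval(text, {"card": card})
print(rebuilt.value, rebuilt.suit)  # 5 spades
```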

Additional Options

The script accepts three command-line options:

  • -h displays help information, including a description of all three options.
  • -n X runs the script on X random students
  • -s X runs the script on the single student with userid X

Note: For the file assignment, if the function consumes the name of a file and produces None (i.e. it only writes to a file), you cannot create test cases for that question.

Last Edited By:

-- ScottFoggo - 04 Apr 2014

-- PeterSinclair - 12 Apr 2013

Topic attachments
  • a09.py.txt (r1, 2.4 K, 2013-04-16 15:48, PeterSinclair): Sample test cases file
  • test_tests (r1, 14.1 K, 2013-04-16 15:47, PeterSinclair): Actual script
  • tt_check.py.txt (r1, 3.1 K, 2013-04-16 15:47, PeterSinclair): Modified check module
Topic revision: r5 - 2014-04-08 - ScottFoggo