BitterSuite Python Testing
Getting Started: test.pt and test.1 folder layout
Python testing using RST/BitterSuite is similar to Racket testing. You have to create folders called test.pt (for public tests) and test.1 (for tests you run after the due date). The test.pt and test.1 folders go inside the marking/aXX folder in the course account (aXX is the assignment number, e.g. a04).
Here is an example of Python testing. In this example, there are two questions:
- Question 1 asks students to define cube(n) = n*n*n. Solutions
- Question 2 asks students to read input from the keyboard and print output to the screen. Solutions
Here is what the test.pt folder may look like:
Example test output created by the above test suite
Your test cases are placed in test.py, which is structured as a regular Python script and is executed like one. Student code is available and acts as if the line from (studentcode) import * had been executed beforehand (i.e. all student code is available in the global scope).
Two variables in test.py have special meaning: result and expected. result should be set so that it receives the return value from the function being tested (though there are more advanced ways to use it: see the examples later). As the name implies, the value of expected should be the value you expect the student code to produce.
RST/BitterSuite will automatically compare result and expected using ==, although the comparison function can be changed (see Equality Testing below).
If there is a file called input in the test directory (i.e. alongside test.py), that file will be sent as standard (keyboard) input as test.py is executed: this can be used to fake input from the interactive shell, for example.
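For instance, the effect can be simulated in plain Python with io.StringIO (an assumption for illustration; the real harness redirects standard input from the input file itself):

```python
import io
import sys

# Pretend these two lines are the contents of the "input" file.
sys.stdin = io.StringIO("Justin\nTrudeau\n")
first = input()    # consumes the first line: "Justin"
second = input()   # consumes the second line: "Trudeau"
sys.stdin = sys.__stdin__  # restore the real standard input
```

Each call to input() consumes one line of the file, which is exactly how tests below use the input file.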
The Python testing system will always reload a student's code just once per question ((loadcode "aXXqY.py")), even if the question uses mutation (this is in contrast to Scheme, where a loadcode command is specified for every test if mutation is used).
The important thing to remember when making test.py is that it is executed as a Python script would be, so it has access to all features of the language. Hence, you can do some fairly advanced things here. However, if you find yourself repeating the same tricks over and over again in separate tests, you may wish to move that behaviour into a module if possible.
provided/ is in the module search path of test.py, so any modules you put there are accessible to test.py simply by performing an import statement; you do not need to include anything in the options file in order to do this.
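For example, a hypothetical helper saved as provided/approx.py (the filename and function are assumptions) would be immediately importable from test.py with from approx import approx_equal:

```python
# provided/approx.py -- hypothetical helper module; because provided/
# is on the module search path, no options-file changes are needed.
def approx_equal(a, b, tol=1e-6):
    """True when a and b differ by at most tol."""
    return abs(a - b) <= tol
```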
Creating the test.pt and test.1 folders
One way to create the test.pt and test.1 folders is to create the folders and files manually (i.e. right-click in a Finder window to create the folder, and create the files using a text editor). This method works but is very slow. Instead, you can use the scripts make-all.py and make-tests.py described in AutomatedAutotestCreation to create the folders automatically.
Some examples of test.py
result = student_func(1,2,3)
expected = 13
This calls student_func and stores the return value in result. The testing suite will check that result is equal to 13, and will print appropriate messages if an error occurs (e.g. student_func throws an exception) or if the answer is not correct.
variable = 12
student_mutate(variable)
result = variable
expected = 8
Result does not always have to be set to the value of the function call. In cases of mutation, for example, the output of the function is often unimportant, and setting result to a mutated variable makes more sense.
In the case where there are multiple values to test (perhaps both a mutated variable and the function output are important), you have two choices: either set result to be a list or dictionary containing those values, or run one test for each important value. Running multiple tests is clearer for students and allows for part marks, so this is often the better choice when there are only two or three important values. However, when several values are important, the clarity is lost in the sheer number of tests and the part marks are probably insignificant, so making a list or dictionary likely makes more sense.
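As a sketch of the dictionary approach (student_append is a hypothetical stand-in for the student's function):

```python
def student_append(lst, item):
    # Hypothetical student function: mutates its argument and
    # also returns a meaningful value.
    lst.append(item)
    return len(lst)

data = [1, 2]
retval = student_append(data, 3)
# Test the return value and the mutated argument together:
result = {"return value": retval, "mutated list": data}
expected = {"return value": 3, "mutated list": [1, 2, 3]}
```

The dictionary keys double as labels in the failure output, which helps students see which of the values was wrong.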
try:
    result = crashy_student_code()
except:
    result = "Error occurred"
finally:
    pass  # put important cleanup here
expected = "One"
If there is important cleanup to do after running student code, such as closing a file, it is often good to use try/except/finally blocks like the example above. These are also useful if you are testing students' error handling. For example, if they are only supposed to catch ZeroDivisionError, you may want to cause a different type of error but still have the student pass the test; without a try/except block, this is impossible.
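A sketch of that error-handling test (student_divide is a hypothetical stand-in for student code that is only supposed to catch ZeroDivisionError):

```python
def student_divide(a, b):
    # Hypothetical student code: catches only ZeroDivisionError.
    try:
        return a / b
    except ZeroDivisionError:
        return None

# Deliberately cause a different error (TypeError). The student should
# not catch it, so the test catches it and counts that as a pass.
try:
    result = student_divide("not a number", 2)
except TypeError:
    result = "TypeError raised"
except Exception:
    result = "wrong error type"
expected = "TypeError raised"
```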
Input/Output
When a function has effects other than consuming and producing values, tests become more complicated. Use the methods below.
Screen Output
To test screen output (print statements) in student code, use the redirect_output module (which you can download from this page, or from the directory /u/cs116/marking/Useful Modules). A simple example is below:
from redirect_output import *
result = redirect_output(studentfunction, [arg1, arg2])[1]
expected = ["First line of output", "Second line"]
The redirect_output function (inside the module of the same name) consumes two arguments: a function and a list of arguments to pass to that function. It produces a 2-tuple containing the function's return value and the screen output as a list of strings; each string in the list is one line of screen output with the newline character removed. Taking element [1], as above, compares only the output lines.
You can of course do more with this. If only particular lines matter, you can take individual elements from the list. If only particular characters in the output matter (maybe each line includes the score of a particular stage of a game), then using a for loop to replace each line with the relevant value will make the output shorter, and will prevent penalizing students for a typo in an unimportant part of the output.
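For instance, if only the number after the colon on each line matters (the line format here is an assumption):

```python
# Lines as they might come back from redirect_output.
lines = ["Stage 1 score: 40", "Stage 2 score: 25"]

# Keep only the numeric part of each line, so a typo in the surrounding
# text does not cost the student the test.
result = [int(line.split(":")[1]) for line in lines]
expected = [40, 25]
```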
IMPORTANT: redirect_output produces a list of lines rather than a single string for a reason. BitterSuite will crash with a cadr error if result or expected contains a newline character. By using a list of lines, the newline character will never appear in the output, but misplaced newline characters will still cause the test to fail (since the lines will be split in the wrong spot).
Keyboard Input
To test keyboard input, save a file named input (no file extension) in the same directory as test.py. Each time the program calls input (raw_input in Python 2), one line of the file input is read as the keyboard input.
When keyboard input is combined with screen or file output, use the suppress_prompt module. Copy the module file from the marking/Useful Modules directory into the provided folder, and include the line (modules "suppress_prompt") in the options.ss file. If this module is not included, the prompts in the student's code will appear in the screen output and make testing more difficult.
File Input
To test file input, save a file (which will be used by the function) in the directory test.X/provided. When writing the test, treat the input file as if it were in the same directory as test.py.
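A sketch of such a test (count_lines is a hypothetical student function; in a real test, data.txt would come from test.X/provided rather than being written here):

```python
def count_lines(filename):
    # Hypothetical student function that consumes a filename.
    with open(filename) as f:
        return len(f.readlines())

# Stand-in for the provided input file (normally shipped in provided/).
with open("data.txt", "w") as f:
    f.write("a\nb\nc\n")

result = count_lines("data.txt")
expected = 3
```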
File Output - Option 1: Reading from the Output File
To test file output, copy fileiomod.py from the directory /u/cs116/marking/Useful Modules into the provided folder. In the test.py file, include the line from fileiomod import get_file_lines at the beginning, and include the line get_file_lines(filename) at the end, where filename is the name of the file that should be created by the function. The function get_file_lines will read in and ultimately produce the contents of the created file as a list of strings (where each string represents a single line in the file). Compare the contents of the output file, represented by this list of strings, with the contents that you expect in the test.py file.
For example, suppose a student function named write_to_file consumes a string representing the name of the output file and writes the following lines to the file:
line 1
line 2
line 3
In the test.py file, you would write the following to test that the student's function write_to_file writes the correct lines to the file:
from fileiomod import get_file_lines
try:
    write_to_file('temp')
    try:
        studentanswer = get_file_lines('temp')
        result = studentanswer
        expected = ['line 1\n', 'line 2\n', 'line 3']
    except:
        result = 'an error while reading the produced file;'
        expected = 'no errors.'
except:
    result = 'an error while running write_to_file;'
    expected = 'no errors.'
File Output - Option 2: Testing like Screen Output
To test file output, copy dumpfile.py from the directory marking/Useful Modules into the provided folder. In the test.py file, include the line from dumpfile import dumpfile at the beginning, and include the line dumpfile(filename) at the end, where filename is the name of the file that should be created by the function. dumpfile will print the contents of the produced file to the screen, so the produced file can be tested in the same way as screen output.
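A minimal sketch of what such a helper might look like (an assumption: the real dumpfile.py in marking/Useful Modules may be more elaborate):

```python
def dumpfile(filename):
    # Print the produced file verbatim so it shows up as screen output.
    with open(filename) as f:
        print(f.read(), end="")

# Tiny demonstration: create a file the way student code might, then dump it.
with open("demo_out.txt", "w") as f:
    f.write("line 1\nline 2\n")
dumpfile("demo_out.txt")
```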
Equality testing
By specifying the option (equal "...") in an options.ss file, you can control how equality is checked. The default is Python's built-in == test, which performs deep equality testing on all built-in types (i.e. objects compare the same if they have the same contents) and shallow equality for class instances which don't define __eq__.
The option must be specified as a lambda expression taking two values, or the invocation of a function which returns such a function. For example, (equal "lambda x,y: x is y") gives Python's standard shallow equality tester (e.g. [1] is [1] yields False, but 1 is 1 yields True; two variables which refer to the same object in memory are also equal under shallow equality). This is somewhat similar to Scheme's eq? predicate.
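As another hypothetical example, numeric answers could be compared with a tolerance via (equal "lambda x,y: abs(x - y) < 1e-6"); in plain Python that lambda behaves as follows:

```python
# Tolerance-based comparison, as the lambda above would be evaluated.
eq = lambda x, y: abs(x - y) < 1e-6

close = eq(0.1 + 0.2, 0.3)  # floating-point roundoff is within tolerance
far = eq(1.0, 2.0)
```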
NOTE: The directory /u/cs116/marking/Useful Modules includes a file equality.py which should be useful in defining equality checks. Read the instructions in that file, and keep in mind that it has not been thoroughly tested.
Questions with Dependencies
From time to time, an assignment will have a question which wants to use another question as a helper function. Perhaps question 1a builds a game board, and question 1b plays the game. It may be desirable to run question 1b using the model solution version of question 1a, instead of the student version. This allows students who struggle on part a to still make an attempt at part b. To do this takes two steps: you first need to delete the student's version of the code, and then provide the correct version.
If both functions are defined in the same file (perhaps 1a and 1b are both included in the file a8q1.py), and if model_solns.py is a file in the aX/test.Y/provided directory, you could include the following code in test.py:
try: del make_game_board
except: pass
from model_solns import play_game
The try/except is important in case the student didn't define make_game_board; if that is the case, del make_game_board will produce an error. You could also copy and paste the definition of play_game into each test.py file, but importing from a single file makes it easier to change the model solution (if necessary).
If the two questions are in different files and students have to import their file from the previous question(s): The idea is that for each question N, we'll create a new directory that contains the student's file for question N, and the solutions for all previous questions (1 to N-1).
For example, suppose Q3 uses Q1 and Q2. In the Q3 file, students have to import their code from Q1 and Q2 like this:
# student's a8q3 file:
from a8q1 import my_q1_function
from a8q2 import my_q2_function
Inside the computeMarks script, we'll create a folder called q3_directory, copy the solutions for Q1 and Q2 there, and copy the student's Q3 file there. Then in options.rkt, we'll use (loadcode "q3_directory/a8q3.py").
Here is what computeMarks might look like. In this code, the Q1 solutions are saved at solutions/a8q1-solutions.py inside the test.pt and test.1 folders. Similarly, the Q2 solutions are saved as solutions/a8q2-solutions.py.
#!/bin/bash
# Create directory for Q3:
mkdir q3_directory
# Copy solutions for Q1-2 into Q3 directory:
cp "${testdir}/solutions/a8q1-solutions.py" "q3_directory/a8q1.py"
cp "${testdir}/solutions/a8q2-solutions.py" "q3_directory/a8q2.py"
# If student submitted a8q3.py, copy it into the Q3 directory
if [ -e "${submitdir}/a8q3.py" ]; then
cp "${submitdir}/a8q3.py" "q3_directory/a8q3.py"
fi
# In options.rkt, use: (loadcode "q3_directory/a8q3.py")
exec /u/isg/bittersuite3/computeMarks -q
Then in options.rkt, use (loadcode "q3_directory/a8q3.py"). When Python runs q3_directory/a8q3.py, it will search q3_directory first for imported files. Since we copied the solutions into q3_directory, the solutions will be imported instead of the student's own files.
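The shadowing can be sketched in plain Python (tmp stands in for q3_directory; the names here are assumptions for illustration):

```python
import os
import sys
import tempfile

# Build a directory containing a "model solution" copy of a8q1.py.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "a8q1.py"), "w") as f:
    f.write("WHO = 'model solution'\n")

# Putting that directory first on sys.path mimics running a script
# from q3_directory: its a8q1.py shadows any other a8q1.py.
sys.path.insert(0, tmp)
import a8q1
```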
Full example of using files from previous questions
Sample Q1:
In a01q1.py, create a class called MyQ1Class that contains a function message(string), which should return the string "Q1 Correct Message: <input string>".
Sample Q2 (depends on Q1):
In a01q2.py, create a class called MyQ2Class that contains a function message(string), which should return the string "Q2 Correct Message: <input string>". Also define a function called join_messages(string) that joins the result of MyQ1Class.message(string) and MyQ2Class.message(string), separated by a newline. Use your solution from Q1 to get MyQ1Class.
Sample Q3 (depends on Q1-2):
In a01q3.py, create a class called MyQ3Class that contains a function message(string), which should return the string "Q3 Correct Message: <input string>". Also define a function called join_messages(string) that joins the result of MyQ1Class.message(string), MyQ2Class.message(string), and MyQ3Class.message(string), separated by a newline. Use your solutions from previous questions by importing the files.
Sample Q4 (depends on Q1-3):
In a01q4.py, create a class called MyQ4Class that contains a function message(string), which should return the string "Q4 Correct Message: <input string>". Also define a function called join_messages(string) that joins the result of MyQ1Class.message(string), MyQ2Class.message(string), MyQ3Class.message(string), and MyQ4Class.message(string), separated by a newline. Use your solutions from previous questions by importing the files.
Example test suite:
The differences between this test suite and what you normally do are:
- The solutions are placed inside the solutions folder.
- In computeMarks, we create a directory for each question that has dependencies. Each question's directory has the student's file for that question, and the solutions from previous questions.
- In options.rkt, we load the file from the question's directory. For example, use (loadcode "q2_directory/a01q2.py") instead of just (loadcode "a01q2.py").
Download example test suite:
https://cs.uwaterloo.ca/twiki/pub/ISG/BitterSuitePythonTesting/question_dependency_example.zip
Modules
See
BitterSuitePythonModules.
*******************************************************************************
Assignment a01 Public Testing
These tests are provided to check that your functions perform correctly on
simple cases. They do not guarantee that your answers are 100% correct. You
should create more tests to ensure that your answers are correct.
**** Testing Results **********************************************************
3/4 Total Mark
** Question 1: 1/2
** Question 2: 2/2
(Question 1, Test t01, 1 marks): Testing cube(3): FAILED; FAILED: got 81 expected 27
(Question 1, Test t02, 1 marks): Testing cube(0): Passed; Congrats! You
passed!
(Question 2, Test t01, 1 marks): Checking Question 2: Passed; passed.
(Question 2, Test t02, 1 marks): Checking Question 2: Passed; passed.
**** End of files *************************************************************
# File: a01q1.py
def cube(n):
    return n*n*n
# File: a01q2.py
def greeting(month, year):
    first_name = input("Enter first name: ")
    last_name = input("Enter last name: ")
    print("Hello %s %s!" % (first_name, last_name))
    print("You were born in %s %d" % (month, year))
    return 2000 - year
#!/bin/sh
exec /u/isg/bittersuite3/computeMarks -q
(verbosity 1)
(print-submit-files false)
(print-by-question true)
(nroff-mark-scheme true)
(interpret-mark-scheme false)
(loadcode "a01q1.py")
(desc "Testing cube(3)")
result = cube(3)
expected = 27
# Custom pass message:
pass_message = "Congrats! You passed!"
(desc "Testing cube(0)")
result = cube(0)
expected = 0
# Custom pass message:
pass_message = "Congrats! You passed!"
(loadcode "a01q2.py")
(desc "Checking Question 2")
Justin
Trudeau
from redirect_output import *
# result will be a 2-tuple. First element is the function return value,
# the second element is screen output
result = redirect_output(greeting, ("December", 1971))
expected = (29, ["Hello Justin Trudeau!", "You were born in December 1971"])
Taylor
Swift
from redirect_output import *
# result will be a 2-tuple. First element is the function return value,
# the second element is screen output
result = redirect_output(greeting, ("December", 1989))
expected = (11, ["Hello Taylor Swift!", "You were born in December 1989"])
(language python)
(modules "suppress_prompt")
(value 1)
; Set a timeout of 5 seconds
(timeout 5)
Assignment a01 Public Testing
These tests are provided to check that your functions perform correctly on
simple cases. They do not guarantee that your answers are 100% correct.
You should create more tests to ensure that your answers are correct.
import sys
backup_stdout = sys.stdout

class output:
    """
    Screen output is redirected to an instance of this class
    while redirect_output is running.
    """
    def __init__(self):
        self.screen = ""

    def __str__(self):
        return self.screen

    def __nonzero__(self):
        return bool(self.screen)

    def write(self, string):
        self.screen += string

    def reset(self):
        self.screen = ""

def redirect_output(func, args=None):
    temp_screen = output()
    sys.stdout = temp_screen
    if args is None:
        retval = func()
    else:
        retval = func(*args)
    sys.stdout = backup_stdout
    return (retval, temp_screen.screen.split('\n')[:-1])
# This module redefines input to suppress the prompt string.
# Load it by using (modules "suppress_prompt") in options.ss.
old_input = input
__builtins__['input'] = lambda prompt='': old_input()
#!/bin/bash
# Create a directory for each question that has dependencies.
# The directory will contain solutions for the previous questions.
# Q1 has no dependencies.
# Create directory for Q2:
mkdir q2_directory
# Copy solutions for Q1 into Q2 directory:
cp "${testdir}/solutions/a01q1-solutions.py" "q2_directory/a01q1.py"
# If student submitted Q2 file, copy the student's Q2 file into the Q2 directory.
if [ -e "${submitdir}/a01q2.py" ]; then
cp "${submitdir}/a01q2.py" "q2_directory/a01q2.py"
fi
# In options.rkt, use: (loadcode "q2_directory/a01q2.py")
# Create directory for Q3:
mkdir q3_directory
# Copy solutions for Q1-2 into Q3 directory:
cp "${testdir}/solutions/a01q1-solutions.py" "q3_directory/a01q1.py"
cp "${testdir}/solutions/a01q2-solutions.py" "q3_directory/a01q2.py"
# Copy the student's Q3 file into the Q3 directory
if [ -e "${submitdir}/a01q3.py" ]; then
cp "${submitdir}/a01q3.py" "q3_directory/a01q3.py"
fi
# In options.rkt, use: (loadcode "q3_directory/a01q3.py")
# Create directory for Q4:
mkdir q4_directory
# Copy solutions for Q1-3 into Q4 directory:
cp "${testdir}/solutions/a01q1-solutions.py" "q4_directory/a01q1.py"
cp "${testdir}/solutions/a01q2-solutions.py" "q4_directory/a01q2.py"
cp "${testdir}/solutions/a01q3-solutions.py" "q4_directory/a01q3.py"
# Copy the student's Q4 file into the Q4 directory
if [ -e "${submitdir}/a01q4.py" ]; then
cp "${submitdir}/a01q4.py" "q4_directory/a01q4.py"
fi
# In options.rkt, use: (loadcode "q4_directory/a01q4.py")
exec /u/isg/bittersuite3/computeMarks -q
(verbosity 1)
(print-submit-files false)
(print-by-question true)
(nroff-mark-scheme true)
(interpret-mark-scheme false)
(loadcode "a01q1.py")
result = MyQ1Class.message("Q1 test 01")
expected = "Q1 Correct Message: Q1 test 01"
result = MyQ1Class.message("Q1 test 02")
expected = "Q1 Correct Message: Q1 test 02"
(loadcode "q2_directory/a01q2.py")
result = join_messages("Q2 test 01")
expected = "Q1 Correct Message: Q2 test 01\n\
Q2 Correct Message: Q2 test 01"
result = join_messages("Q2 test 02")
expected = "Q1 Correct Message: Q2 test 02\n\
Q2 Correct Message: Q2 test 02"
(loadcode "q3_directory/a01q3.py")
result = join_messages("Q3 test 01")
expected = "Q1 Correct Message: Q3 test 01\n\
Q2 Correct Message: Q3 test 01\n\
Q3 Correct Message: Q3 test 01"
result = join_messages("Q3 test 02")
expected = "Q1 Correct Message: Q3 test 02\n\
Q2 Correct Message: Q3 test 02\n\
Q3 Correct Message: Q3 test 02"
(loadcode "q4_directory/a01q4.py")
result = join_messages("Q4 test 01")
expected = "Q1 Correct Message: Q4 test 01\n\
Q2 Correct Message: Q4 test 01\n\
Q3 Correct Message: Q4 test 01\n\
Q4 Correct Message: Q4 test 01"
result = join_messages("Q4 test 02")
expected = "Q1 Correct Message: Q4 test 02\n\
Q2 Correct Message: Q4 test 02\n\
Q3 Correct Message: Q4 test 02\n\
Q4 Correct Message: Q4 test 02"
(language python)
(value 1)
; Set a timeout of 5 seconds
(timeout 5)
# a01q1 model solutions
class MyQ1Class:
    def message(string):
        return "Q1 Correct Message: %s" % string
# a01q2 model solutions
from a01q1 import MyQ1Class
class MyQ2Class:
    def message(string):
        return "Q2 Correct Message: %s" % string
def join_messages(string):
    return "\n".join([MyQ1Class.message(string), MyQ2Class.message(string)])
# a01q3 model solutions
from a01q1 import MyQ1Class
from a01q2 import MyQ2Class
class MyQ3Class:
    def message(string):
        return "Q3 Correct Message: %s" % string
def join_messages(string):
    return "\n".join([MyQ1Class.message(string), MyQ2Class.message(string), MyQ3Class.message(string)])