ISG Web>ISGScripts>Testing>BitterSuite>BitterSuitePythonTesting (2021-06-14, YiLee)

BitterSuite Python Testing

Getting Started: test.pt and test.1 folder layout

Python testing using RST/BitterSuite is similar to Racket testing. You have to create folders called test.pt (for public tests) and test.1 (for tests you run after the due date). The test.pt and test.1 folders go inside the marking/aXX folder in the course account (aXX is the assignment number, e.g. a04).

Here is an example of Python testing. In this example, there are two questions:

Question 1 asks students to define cube(n)=n*n*n. Solutions
Question 2 asks students to read from the keyboard and print some stuff to the screen. Solutions

Here is what the test.pt folder may look like:

test.pt/
- answers/
- computeMarks
- config.ss
- mark-scheme
- provided/
  - redirect_output.py
  - suppress_prompt.py
- in/
  - options.rkt
  - 1/
    - options.rkt
    - t01/
      - options.rkt
      - test.py
    - t02/
      - options.rkt
      - test.py
  - 2/
    - options.rkt
    - t01/
      - input
      - test.py
    - t02/
      - input
      - test.py

Example test output created by the above test suite

Your test cases are placed in test.py, which is structured as a regular Python script, and is executed like one. Student code is available and acts as if the line from (studentcode) import * had been executed beforehand (i.e. all student code is available in the global scope).

Two variables in test.py have special meaning: result and expected. result should be set so that it receives the return value from the function which should be tested (though there are more advanced ways to use it: see examples later). As the name implies, the value of expected should be the value you expect to be produced by the student code. RST/BitterSuite will automatically compare result and expected using ==, although the comparison function can be changed (see Equality testing below).

If there is a file called input in the test directory (i.e. alongside test.py), that file will be sent as standard (keyboard) input as test.py is executed: this can be used to fake input from the interactive shell, for example.

The Python testing system will always reload a student's code just once per question (loadcode "aXXqY.py"), even if the question uses mutation (this is in contrast to Scheme, where a loadcode command is specified for every test if mutation is used).

The important thing to remember when making test.py is that it is executed as a Python script might, so it has access to all features of the language. Hence, you can do some fairly advanced stuff here. However, if you find yourself repeating the same tricks over and over again in separate tests, you may wish to make that behaviour into a module if possible.

provided/ is in the module search path of test.py, so any modules you put there are accessible to test.py simply by performing an import statement; you do not need to include anything in the options file in order to do this.
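As a rough sketch of the kind of helper that could live in provided/, the block below defines a function that wraps a call and reports any exception as a short string. The module name test_helpers.py and the names call_safely and student_divide are hypothetical; they are not part of BitterSuite.

```python
# Hypothetical contents of provided/test_helpers.py (the file name is an assumption).
def call_safely(func, *args):
    """Call func(*args); return its value, or a short string naming the exception."""
    try:
        return func(*args)
    except Exception as err:
        return "Error: " + type(err).__name__


# In a real test.py you would write: from test_helpers import call_safely
# Stand-in for a student function, defined here only so the sketch runs on its own.
def student_divide(a, b):
    return a / b


result = call_safely(student_divide, 6, 3)
expected = 2.0
```

Keeping such a helper in one module means a change to it only has to be made once, rather than in every test.py.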

Creating the test.pt and test.1 folders

One way to create the test.pt and test.1 folders is to create the folders and files manually (i.e., right-click in a Finder window to create the folder, and create the files using a text editor). This method works but is very slow.

Instead, you can use the scripts make-all.py and make-tests.py described in AutomatedAutotestCreation to create the folders automatically.

Some examples of test.py

result = student_func(1,2,3)
expected = 13

This calls student_func and stores the return value in result. The testing suite will check that result is equal to 13, and will print appropriate messages if an error occurs (e.g. student_func throws an exception) or if the answer is not correct.

variable = 12
student_mutate(variable)
result = variable
expected = 8

result does not always have to be set to the value of the function call. In cases of mutation, for example, the output of the function is often unimportant, and setting result to a mutated variable makes more sense.

In the case where there are multiple values to test (perhaps both a mutated variable and the function output are important), you have two choices: either set result to be a list or dictionary containing those values, or run one test for each important value. Running multiple tests is clearer for students and allows for part marks, so this is often the better choice when there are only two or three important values. However, when several values are important, the clarity is lost in the sheer number of tests and the part marks are probably insignificant, so making a list or dictionary likely makes more sense.
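A minimal sketch of the dictionary approach is shown below. The function remove_negatives is a hypothetical stand-in for student code, included only so the example runs by itself; in a real test.py the student's function would already be in scope.

```python
# Stand-in for student code (hypothetical): mutates its argument and returns a count.
def remove_negatives(numbers):
    removed = 0
    for value in list(numbers):   # iterate over a copy while mutating the original
        if value < 0:
            numbers.remove(value)
            removed += 1
    return removed


data = [3, -1, 4, -5]
returned = remove_negatives(data)

# Bundle both values of interest so a single comparison checks them together.
result = {"returned": returned, "mutated": data}
expected = {"returned": 2, "mutated": [3, 4]}
```

Using named keys rather than a plain list makes it easier for a student reading the test output to see which value was wrong.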

try:
   result = crashy_student_code()
except:
   result = "Error occurred"
finally:
   # put important cleanup here
   pass
expected = "One"

If there is important cleanup to do after running student code, such as closing a file, it is often good to use try/except/finally blocks like the example above. These are also useful if you are testing students' error handling. For example, if they are only supposed to catch ZeroDivisionError, you may want to cause a different type of error but still have the student pass the test; without a try/except block, this is impossible.
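The error-handling case described above might be tested along the following lines. This is only a sketch: safe_divide is a hypothetical stand-in for a student function that is supposed to handle ZeroDivisionError and nothing else.

```python
# Stand-in for student code (hypothetical): should handle ZeroDivisionError only.
def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None


try:
    safe_divide("ten", 2)            # str / int raises TypeError
    result = "no exception raised"
except TypeError:
    result = "TypeError propagated"  # the student did not swallow the wrong error
except Exception as err:
    result = "unexpected " + type(err).__name__
expected = "TypeError propagated"
```

Setting result to a descriptive string in every branch keeps the failure message readable if the student's code catches too much.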

Input/Output

When a function has effects other than consuming and producing values, tests become more complicated. Use the methods below.

Screen Output

To test screen output (print statements) in student code, use the redirect_output module (which you can download from this page, or from the directory /u/cs116/marking/Useful Modules). A simple example is below:

from redirect_output import *
result = redirect_output(studentfunction, [arg1, arg2])
expected = ["First line of output", "Second line"]

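The redirect_output function (inside the module of the same name) consumes two arguments, a function and the arguments to that function (as a list), and produces the screen output as a list of strings. Each string in the list is one line of screen output with the newline character removed. Note that the version of redirect_output.py reproduced later on this page returns a 2-tuple (return value, list of output lines) instead, so check which version is in your provided folder before writing expected.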

You can of course do more with this. If only particular lines matter, you can take individual elements from the list. If only particular characters in the output matter (maybe each line includes the score of a particular stage of a game), then using a for loop to replace each line with the relevant value will make the output shorter, and will prevent penalizing students for a typo in an unimportant part of the output.
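As a sketch of the for-loop idea, the block below keeps only the last integer on each line of output. The list output_lines is made-up data standing in for what redirect_output would produce.

```python
import re

# Stand-in for the list of lines produced by redirect_output (hypothetical data).
output_lines = ["Stage 1 complete! Score: 10",
                "Stage 2 complete!  Score: 25",
                "Game over. Final score: 35"]

# Keep only the last integer on each line; lines with no number become None.
scores = []
for line in output_lines:
    numbers = re.findall(r"-?\d+", line)
    scores.append(int(numbers[-1]) if numbers else None)

result = scores
expected = [10, 25, 35]
```

With this reduction, a student who prints "Stage 1 done" instead of "Stage 1 complete!" still passes as long as the score is right.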

IMPORTANT: redirect_output produces a list of lines rather than a single string for a reason. BitterSuite will crash with a cadr error if result or expected contains a newline character. By using a list of lines, the newline character will never appear in the output, but misplaced newline characters will still cause the test to fail (since the lines will be split in the wrong spot).
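If a value you want to compare is a single multi-line string (for instance a function's return value), one way to respect the warning above is to split it into lines before assigning it. This is a small sketch with made-up text, not code taken from BitterSuite.

```python
# A string with embedded newlines, such as a function's multi-line return value.
text = "first line\nsecond line\nthird line"

# splitlines() drops the newline characters, so none reach result or expected.
result = text.splitlines()
expected = ["first line", "second line", "third line"]
```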

Keyboard Input

To test keyboard input, save a file named input (no file extension) in the same directory as test.py. Each time the program calls input (raw_input in Python 2), one line of the file input is read as the keyboard input.

When keyboard input is combined with screen or file output, use the suppress_prompt module. Copy the module file from the marking/Useful Modules directory into the provided folder, and include the line (modules "suppress_prompt") in the options.ss file. If this module is not included, the prompts in the student's code will appear in the screen output and make testing more difficult.
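The effect of prompt suppression can be seen outside BitterSuite with a small simulation. In the sketch below, keyboard input is faked with an in-memory stream, and the helper capture (a hypothetical name) optionally replaces input with a version that ignores its prompt, which is the same idea suppress_prompt relies on. The function ask_name stands in for student code.

```python
import builtins
import io
import sys
from contextlib import redirect_stdout


# Stand-in for student code (hypothetical).
def ask_name():
    name = input("Enter name: ")
    print("Hello " + name + "!")


def capture(func, keyboard_text, suppress_prompt=False):
    """Run func with keyboard_text as stdin; return the screen output as lines."""
    original_stdin, original_input = sys.stdin, builtins.input
    sys.stdin = io.StringIO(keyboard_text)
    if suppress_prompt:
        # Ignore the prompt argument so it never reaches the screen.
        builtins.input = lambda prompt="": original_input()
    screen = io.StringIO()
    try:
        with redirect_stdout(screen):
            func()
    finally:
        # Always restore the real stdin and input, even if func raises.
        sys.stdin, builtins.input = original_stdin, original_input
    return screen.getvalue().split("\n")[:-1]


with_prompt = capture(ask_name, "Taylor\n")
without_prompt = capture(ask_name, "Taylor\n", suppress_prompt=True)
```

Without suppression the prompt text is fused onto the first output line, so expected would have to reproduce every prompt exactly; with suppression only the printed lines remain.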

File Input

To test file input, save a file (which will be used by the function) in the directory test.X/provided. When writing the test, treat the input file as if it is in the same directory as test.py.

File Output - Option 1: Reading from the Output File

To test file output, copy fileiomod.py from the directory /u/cs116/marking/Useful Modules into the provided folder. In the test.py file, include the line from fileiomod import get_file_lines at the beginning, and include the line get_file_lines(filename) at the end, where filename is the name of the file that should be created by the function. The function get_file_lines will read in and ultimately produce the contents of the created file as a list of strings (where each string represents a single line in the file). Compare the contents of the output file represented by this list of strings with the contents that you expect in the test.py file.

For example, suppose a student function named write_to_file consumes a string representing the name of the output file and writes the following lines to the file:

line 1
line 2
line 3

In the test.py file, you would write the following to test that the student's function write_to_file writes the correct lines to the file:

from fileiomod import get_file_lines
try:
   write_to_file('temp')
   try:
      studentanswer = get_file_lines('temp')
      result = studentanswer
      expected = ['line 1\n', 'line 2\n', 'line 3']
   except:
      result = 'an error while reading the produced file;'
      expected = 'no errors.'
except:
   result = 'an error while running write_to_file;'
   expected = 'no errors.'
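The source of fileiomod.py is not reproduced on this page. As a rough idea of what a helper like get_file_lines could do, the sketch below assumes it simply returns the file's lines with their newline characters intact, which matches the expected value in the example above. The name get_file_lines_sketch and the stand-in write_to_file are hypothetical.

```python
import os
import tempfile


def get_file_lines_sketch(filename):
    """Return the file's lines as a list of strings, newline characters included."""
    with open(filename) as handle:
        return handle.readlines()


# Stand-in for the student function from the example above.
def write_to_file(filename):
    with open(filename, "w") as handle:
        handle.write("line 1\nline 2\nline 3")


# Write into a fresh temporary directory so the sketch leaves no stray files behind.
path = os.path.join(tempfile.mkdtemp(), "temp")
write_to_file(path)
result = get_file_lines_sketch(path)
expected = ["line 1\n", "line 2\n", "line 3"]
```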

File Output - Option 2: Testing like Screen Output

To test file output, copy dumpfile.py from the directory marking/Useful Modules into the provided folder. In the test.py file, include the line from dumpfile import dumpfile at the beginning, and include the line dumpfile(filename) at the end, where filename is the name of the file that should be created by the function. dumpfile will print the contents of the produced file to the screen, so the produced file can be tested in the same way as screen output.
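The real dumpfile.py is not shown here, so the following is only a guess at its behaviour based on the description above: print each line of the file to the screen. The name dumpfile_sketch is hypothetical, and the screen capture uses the standard library rather than redirect_output so that the block runs on its own.

```python
import io
import os
import tempfile
from contextlib import redirect_stdout


def dumpfile_sketch(filename):
    """Print the contents of filename to the screen, one line at a time."""
    with open(filename) as handle:
        for line in handle:
            print(line.rstrip("\n"))


# Create a small file standing in for one produced by student code.
path = os.path.join(tempfile.mkdtemp(), "produced.txt")
with open(path, "w") as handle:
    handle.write("line 1\nline 2\n")

# Capture what the helper prints, then compare it as a list of lines.
screen = io.StringIO()
with redirect_stdout(screen):
    dumpfile_sketch(path)
result = screen.getvalue().splitlines()
expected = ["line 1", "line 2"]
```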

Equality testing

By specifying the option (equal "...") in an options.ss file, you can control how equality is checked. The default is Python's built-in "==" test, which performs deep equality testing on all built-in types (i.e. objects compare the same if they have the same contents) and shallow equality for class instances which don't define __eq__.

The option must be specified as a lambda expression taking two values, or the invocation of a function which returns such a function (confusing?). For example

(equal "lambda x,y: x is y")

gives Python's standard shallow equality tester (e.g. [1] is [1] yields False, but 1 is 1 yields True; two variables which refer to the same object in memory are also equal under shallow equality). This is somewhat similar to Scheme's eq? predicate.
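The difference between these comparisons can be checked directly in Python. The sketch below also includes a tolerance-based comparison for floating-point results; the name close_enough and the tolerance 0.0001 are arbitrary choices for illustration, and whether a given lambda is accepted by the (equal "...") option should be confirmed against your BitterSuite installation.

```python
# Candidate comparison functions, written in the two-argument lambda form.
shallow = lambda x, y: x is y
close_enough = lambda x, y: abs(x - y) < 0.0001   # tolerance is an arbitrary choice

first = [1, 2]
second = [1, 2]

deep_result = (first == second)            # True: same contents
shallow_result = shallow(first, second)    # False: two separate list objects
same_object = shallow(first, first)        # True: one object compared with itself
float_result = close_enough(0.1 + 0.2, 0.3)  # True, although 0.1 + 0.2 != 0.3
```

A tolerance comparison of this kind is useful when student answers involve floating-point arithmetic, where exact == comparison often fails on results that are mathematically correct.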

NOTE: The directory /u/cs116/marking/Useful Modules includes a file equality.py which should be useful in defining equality checks. Read the instructions in that file, and keep in mind that it has not been thoroughly tested.

Questions with Dependencies

From time to time, an assignment will have a question which wants to use another question as a helper function. Perhaps question 1a builds a game board, and question 1b plays the game. It may be desirable to run question 1b using the model solution version of question 1a, instead of the student version. This allows students who struggle on part a to still make an attempt at part b. To do this takes two steps: you first need to delete the student's version of the code, and then provide the correct version.

If both functions are defined in the same file (perhaps 1a and 1b are both included in the file a8q1.py), and if model_solns.py is a file in the aX/test.Y/provided directory, you could include the following code in test.py

try:    del make_game_board
except: pass
from model_solns import make_game_board

The try/except is important in case the student didn't define make_game_board; if that is the case, del make_game_board will produce an error. You could also copy and paste the definition of make_game_board into each test.py file, but importing from a single file makes it easier to change the model solution (if necessary).

If the two questions are in different files and students have to import their file from the previous question(s), the idea is that for each question N, we create a new directory that contains the student's file for question N and the solutions for all previous questions (1 to N-1).

For example, suppose Q3 uses Q1 and Q2. In the Q3 file, students have to import their code from Q1 and Q2 like this:

# student's a8q3 file:
from a8q1 import my_q1_function
from a8q2 import my_q2_function

Inside the computeMarks script, we'll create a folder called q3_directory, copy the solutions for Q1 and Q2 there, and copy the student's Q3 file there. Then in options.rkt, we'll use (loadcode "q3_directory/a8q3.py").

Here is what computeMarks might look like. In this code, the Q1 solutions are saved at solutions/a8q1-solutions.py inside the test.pt and test.1 folders. Similarly, the Q2 solutions are saved as a8q2-solutions.py inside the solutions folder.

#!/bin/bash

# Create directory for Q3:
mkdir q3_directory
# Copy solutions for Q1-2 into Q3 directory:
cp "${testdir}/solutions/a8q1-solutions.py" "q3_directory/a8q1.py"
cp "${testdir}/solutions/a8q2-solutions.py" "q3_directory/a8q2.py"
# If student submitted a8q3.py, copy it into the Q3 directory
if [ -e "${submitdir}/a8q3.py" ]; then
  cp "${submitdir}/a8q3.py" "q3_directory/a8q3.py"
fi
# In options.rkt, use: (loadcode "q3_directory/a8q3.py")

exec /u/isg/bittersuite3/computeMarks -q

Then in options.rkt, use (loadcode "q3_directory/a8q3.py"). When Python runs q3_directory/a8q3.py, it will search for imported files from the q3_directory first. Since we copied the solutions into q3_directory, the solutions will be imported instead of the student's own files.
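The import behaviour described above can be reproduced in isolation. The sketch below builds a throwaway directory layout with a "student" a8q1.py next to q3_directory and a "solution" a8q1.py inside it, then runs a8q3.py as a script. It assumes the file is run as an ordinary Python script; how loadcode invokes it internally is not shown on this page, and the file contents here are made up.

```python
import os
import subprocess
import sys
import tempfile

root = tempfile.mkdtemp()
question_dir = os.path.join(root, "q3_directory")
os.mkdir(question_dir)


def write(path, text):
    with open(path, "w") as handle:
        handle.write(text)


# A "student" a8q1.py in the working directory and a "solution" copy in q3_directory.
write(os.path.join(root, "a8q1.py"),
      "def my_q1_function():\n    return 'student Q1'\n")
write(os.path.join(question_dir, "a8q1.py"),
      "def my_q1_function():\n    return 'solution Q1'\n")
write(os.path.join(question_dir, "a8q3.py"),
      "from a8q1 import my_q1_function\nprint(my_q1_function())\n")

# Run the Q3 file as a script from the parent directory.
completed = subprocess.run(
    [sys.executable, os.path.join("q3_directory", "a8q3.py")],
    cwd=root, capture_output=True, text=True, check=True)
imported_from = completed.stdout.strip()
```

When a file is run as a script, Python places the script's own directory at the front of the module search path, so the copy inside q3_directory is found before the one in the working directory.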

Full example of using files from previous questions

Sample Q1:

In a01q1.py, create a class called MyQ1Class that contains a function message(string), which should return the string "Q1 Correct Message: <input string>"

Sample Q2 (depends on Q1):

In a01q2.py, create a class called MyQ2Class that contains a function message(string), which should return the string "Q2 Correct Message: <input string>"

Also define a function called join_messages(string) that joins the result of MyQ1Class.message(string) and MyQ2Class.message(string) separated by a newline. Use your solution from Q1 to get MyQ1Class.

Sample Q3 (depends on Q1-2):

In a01q3.py, create a class called MyQ3Class that contains a function message(string), which should return the string "Q3 Correct Message: <input string>"

Also define a function called join_messages(string) that joins the result of MyQ1Class.message(string), MyQ2Class.message(string), and MyQ3Class.message(string) separated by a newline. Use your solutions from previous questions by importing the files.

Sample Q4 (depends on Q1-3):

In a01q4.py, create a class called MyQ4Class that contains a function message(string), which should return the string "Q4 Correct Message: <input string>"

Also define a function called join_messages(string) that joins the result of MyQ1Class.message(string), MyQ2Class.message(string), MyQ3Class.message(string), and MyQ4Class.message(string) separated by a newline. Use your solutions from previous questions by importing the files.

Example test suite:

The differences between this test suite and what you normally do are:

- The solutions are placed inside the solutions folder.
- In computeMarks, we create a directory for each question that has dependencies. Each question's directory has the student's file for that question, and the solutions from previous questions.
- In options.rkt, we load the file from the question's directory. For example, use (loadcode "q2_directory/a01q2.py") instead of just (loadcode "a01q2.py").

Download example test suite: https://cs.uwaterloo.ca/twiki/pub/ISG/BitterSuitePythonTesting/question_dependency_example.zip

test.pt/
- answers/
- computeMarks
- config.ss
- in/
  - 1/
    - options.rkt
    - t01/
      - test.py
    - t02/
      - test.py
  - 2/
    - options.rkt
    - t01/
      - test.py
    - t02/
      - test.py
  - 3/
    - options.rkt
    - t01/
      - test.py
    - t02/
      - test.py
  - 4/
    - options.rkt
    - t01/
      - test.py
    - t02/
      - test.py
  - options.rkt
- solutions/

Modules

See BitterSuitePythonModules.

*******************************************************************************

Assignment a01 Public Testing

These  tests  are  provided  to check that your functions perform correctly on
simple cases. They do not guarantee that your answers are 100%  correct.   You
should create more tests to ensure that your answers are correct.
**** Testing Results **********************************************************

3/4   Total Mark

 ** Question 1: 1/2
 ** Question 2: 2/2

(Question 1, Test t01, 1 marks): Testing cube(3): FAILED; FAILED: got 81 ex-
    pected 27
(Question 1, Test t02, 1 marks): Testing cube(0): Passed; Congrats! You
    passed!
(Question 2, Test t01, 1 marks): Checking Question 2: Passed; passed.
(Question 2, Test t02, 1 marks): Checking Question 2: Passed; passed.
**** End of files *************************************************************

# File: a01q1.py
def cube(n):
    return n*n*n

# File: a01q2.py
def greeting(month, year):
    first_name = input("Enter first name: ")
    last_name = input("Enter last name: ")
    print("Hello %s %s!" % (first_name, last_name))
    print("You were born in %s %d" % (month, year))
    return 2000 - year

#!/bin/sh
exec /u/isg/bittersuite3/computeMarks -q

(verbosity 1)
(print-submit-files false)
(print-by-question true)
(nroff-mark-scheme true)
(interpret-mark-scheme false)

(loadcode "a01q1.py")

(desc "Testing cube(3)")

result = cube(3)
expected = 27
# Custom pass message:
pass_message = "Congrats! You passed!"

(desc "Testing cube(0)")

result = cube(0)
expected = 0
# Custom pass message:
pass_message = "Congrats! You passed!"

(loadcode "a01q2.py")
(desc "Checking Question 2")

Justin
Trudeau

from redirect_output import *

# result will be a 2-tuple. First element is the function return value,
# the second element is screen output
result = redirect_output(greeting, ("December", 1971))
expected = (29, ["Hello Justin Trudeau!", "You were born in December 1971"])

Taylor
Swift

from redirect_output import *

# result will be a 2-tuple. First element is the function return value,
# the second element is screen output
result = redirect_output(greeting, ("December", 1989))
expected = (11, ["Hello Taylor Swift!", "You were born in December 1989"])

(language python)
(modules "suppress_prompt")
(value 1)
; Set a timeout of 5 seconds
(timeout 5)

Assignment a01 Public Testing

These tests are provided to check that your functions perform correctly on
simple cases. They do not guarantee that your answers are 100% correct.
You should create more tests to ensure that your answers are correct.

import sys

backup_stdout = sys.stdout

class output:
    """
    Screen output is redirected to this class
    whenever set_screen is called.
    """
    def __init__(self):
        self.screen = ""
    def __str__(self):
        return self.screen
    def __bool__(self):
        return bool(self.screen)
    __nonzero__ = __bool__  # Python 2 name for the same hook
    def write(self, string):
        self.screen += string
    def reset(self):
        self.screen = ""

def redirect_output(func, args = None):
    temp_screen = output()
    sys.stdout = temp_screen
    try:
        if args is None:
            retval = func()
        else:
            retval = func(*args)
    finally:
        # Restore the real stdout even if func raises an exception
        sys.stdout = backup_stdout
    return (retval, temp_screen.screen.split('\n')[:-1])

# This module redefines input to suppress the prompt string.
# Load it by using (modules "suppress_prompt") in options.ss.
old_input = input
__builtins__['input'] = lambda prompt='': old_input()

#!/bin/bash

# Create a directory for each question that has dependencies.
# The directory will contain solutions for the previous questions.

# Q1 has no dependencies.

# Create directory for Q2:
mkdir q2_directory
# Copy solutions for Q1 into Q2 directory:
cp "${testdir}/solutions/a01q1-solutions.py" "q2_directory/a01q1.py"
# If student submitted Q2 file, copy the student's Q2 file into the Q2 directory.
if [ -e "${submitdir}/a01q2.py" ]; then
  cp "${submitdir}/a01q2.py" "q2_directory/a01q2.py"
fi
# In options.rkt, use: (loadcode "q2_directory/a01q2.py")


# Create directory for Q3:
mkdir q3_directory
# Copy solutions for Q1-2 into Q3 directory:
cp "${testdir}/solutions/a01q1-solutions.py" "q3_directory/a01q1.py"
cp "${testdir}/solutions/a01q2-solutions.py" "q3_directory/a01q2.py"
# Copy the student's Q3 file into the Q3 directory
if [ -e "${submitdir}/a01q3.py" ]; then
  cp "${submitdir}/a01q3.py" "q3_directory/a01q3.py"
fi
# In options.rkt, use: (loadcode "q3_directory/a01q3.py")


# Create directory for Q4:
mkdir q4_directory
# Copy solutions for Q1-3 into Q4 directory:
cp "${testdir}/solutions/a01q1-solutions.py" "q4_directory/a01q1.py"
cp "${testdir}/solutions/a01q2-solutions.py" "q4_directory/a01q2.py"
cp "${testdir}/solutions/a01q3-solutions.py" "q4_directory/a01q3.py"
# Copy the student's Q4 file into the Q4 directory
if [ -e "${submitdir}/a01q4.py" ]; then
  cp "${submitdir}/a01q4.py" "q4_directory/a01q4.py"
fi
# In options.rkt, use: (loadcode "q4_directory/a01q4.py")


exec /u/isg/bittersuite3/computeMarks -q

(verbosity 1)
(print-submit-files false)
(print-by-question true)
(nroff-mark-scheme true)
(interpret-mark-scheme false)

(loadcode "a01q1.py")

result = MyQ1Class.message("Q1 test 01")
expected = "Q1 Correct Message: Q1 test 01"

result = MyQ1Class.message("Q1 test 02")
expected = "Q1 Correct Message: Q1 test 02"

(loadcode "q2_directory/a01q2.py")

result = join_messages("Q2 test 01")
expected = "Q1 Correct Message: Q2 test 01\n\
Q2 Correct Message: Q2 test 01"

result = join_messages("Q2 test 02")
expected = "Q1 Correct Message: Q2 test 02\n\
Q2 Correct Message: Q2 test 02"

(loadcode "q3_directory/a01q3.py")

result = join_messages("Q3 test 01")
expected = "Q1 Correct Message: Q3 test 01\n\
Q2 Correct Message: Q3 test 01\n\
Q3 Correct Message: Q3 test 01"

result = join_messages("Q3 test 02")
expected = "Q1 Correct Message: Q3 test 02\n\
Q2 Correct Message: Q3 test 02\n\
Q3 Correct Message: Q3 test 02"

(loadcode "q4_directory/a01q4.py")

result = join_messages("Q4 test 01")
expected = "Q1 Correct Message: Q4 test 01\n\
Q2 Correct Message: Q4 test 01\n\
Q3 Correct Message: Q4 test 01\n\
Q4 Correct Message: Q4 test 01"

result = join_messages("Q4 test 02")
expected = "Q1 Correct Message: Q4 test 02\n\
Q2 Correct Message: Q4 test 02\n\
Q3 Correct Message: Q4 test 02\n\
Q4 Correct Message: Q4 test 02"

(language python)
(value 1)
; Set a timeout of 5 seconds
(timeout 5)

# a01q1 model solutions
class MyQ1Class:
  def message(string):
    return "Q1 Correct Message: %s" % string

# a01q2 model solutions

from a01q1 import MyQ1Class

class MyQ2Class:
  def message(string):
    return "Q2 Correct Message: %s" % string

def join_messages(string):
    return "\n".join([MyQ1Class.message(string), MyQ2Class.message(string)])

# a01q3 model solutions

from a01q1 import MyQ1Class
from a01q2 import MyQ2Class

class MyQ3Class:
  def message(string):
    return "Q3 Correct Message: %s" % string

def join_messages(string):
    return "\n".join([MyQ1Class.message(string), MyQ2Class.message(string), MyQ3Class.message(string)])

Attachments

- question_dependency_example.zip (8.3 K, 2019-11-19, YiLee)
- redirect_output.py.txt (0.6 K, 2012-09-14, PeterSinclair)

Topic revision: r27 - 2021-06-14 - YiLee


Instructional Support Group, David R. Cheriton School of Computer Science, University of Waterloo