In Fall 2012, CS 135 tried using a script to check the students' check-expects. The script checks if the student has thoroughly tested their code and if they have the required testcases. This page is about that script.
In the assignment marking scheme, there is typically a list of the testcases students should have. For example, if students have to write a function sum-lon to add a list of numbers, the required testcases might be:
The script works by re-defining check-expect (and check-within and check-error) using Racket's define-syntax. It then runs the student's code in a sandbox environment. After the student's code is run, you will have access to a list containing the student's check-expects.
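As a rough illustration of this idea, the sketch below re-defines check-expect with define-syntax so that each test is recorded as data instead of being run. The names add-a-testcase and get-all-testcases mirror the helpers described later on this page, but the bodies shown here are assumptions made for the example, not the actual collector code.

```racket
#lang racket
;; Illustrative sketch only; the real collector may differ.

;; testcases holds every recorded test, most recent first.
(define testcases '())

;; add-a-testcase: adds one test to the front of testcases.
(define (add-a-testcase tc)
  (set! testcases (cons tc testcases)))

;; get-all-testcases: produces the recorded tests.
(define (get-all-testcases) testcases)

;; Re-defined check-expect: quote the test rather than evaluating it.
(define-syntax check-expect
  (syntax-rules ()
    [(_ actual expected)
     (add-a-testcase '(check-expect actual expected))]))

;; Two "student" tests. Neither is evaluated, so (/ 1 0) raises no error.
(check-expect (add1 4) 5)
(check-expect (/ 1 0) 'inf)

(get-all-testcases)
```

Because the tests are quoted, a test that would normally raise an error is still recorded, and it is up to the checking script to decide what to do with it.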
Some information about the script's behaviour:
the script can still get at the 47, and do whatever it needs to do with it.
Attached here are two example testcase files. By convention, these files are named questionname-tc.rkt. All of these files are then put in check-testcases/aX/test-cases, where X is the assignment number (for example, a09).
redux-tc.rkt super-foldr-tc.rkt
Each question in an assignment should have its own testcase file, and all subquestions should have their own "add" call designated to them (add is explained in more detail lower on the page). All question names should then be added to a file-list.txt which is in the same folder, in the order the questions appear. If an assignment has questions name1, name2, and name3, there should be a file-list with contents "name1 name2 name3".
Between Fall 2018 and Winter 2019, Paul Nijjar moved all of the common code in each test case checking file into a common tc-lib.rkt. The refactored files can be found in ~/check-testcases/aXY/f18_test_refactored, while the old ones are in the f18_test folder.
On the filesystem, the new scripts (tc-lib.rkt and the collector scripts) live in ~/check-testcases/factored/common .
The refactored scripts are stored in the ISG gitlab repo here: https://git.uwaterloo.ca/isg/racket-check-testcases .
Work for improving the shell scripts has also been done in run-student-flags.sh. For now, just stick to run-all.sh.
Between Fall 2018 and Winter 2019, ISAs and Nick Lee made adjustments to the tc-lib.rkt mentioned above to allow for a .csv to be produced that set_marks_csv can use. Since all the code is embedded in the Racket files, this feature is automatically available when run-all.sh or run-student.sh is executed. When a particular student's test cases are checked, there is additional code to write to a file called output.csv located in the same folder as the script (most likely f18_test_refactored). The only extra work is to actually run set_marks_csv using output.csv. You should check that output.csv looks correct and has nothing like "CATEGORY MISSING" anywhere.
There is no code to clear out output.csv when check-testcases is re-run multiple times. It only gets cleared if run-all.sh is run with the "replace" flag, but if it's run with "skip" (the default flag) and a different folder in the -output flag, the new rows will get appended to the old output.csv and it might get unnecessarily big. You can manually delete output.csv if you are doing a "fresh" check-testcases run and are expecting to re-run on every single student.
For each assignment, you have to edit the "Assignment Specifics" section of demo.rkt (you can rename demo.rkt if you want). Here is a description of the variables and functions you'll use. For examples, see demo.rkt.
1. Define your own "member?" in tc files, since full racket does not provide it.
(define (member? n lst)
  (ormap (lambda (i) (equal? n i)) lst))
2. Check valid (listof Str) inputs:
(and ...
     (list? (... fcn-app))
     (andmap string? (... fcn-app))
     ...)
3. One way to check valid BST inputs using flatten method: (suppose the leaves are Sym)
(define (nat? n)
  (and (integer? n) (>= n 0)))
(define (tree? x)
  (or (symbol? x)
      (and (node? x)
           (node/valid? x))))
(define (node/valid? t)
  (and (node? t)
       (nat? (node-val t))
       (tree? (node-left t))
       (andmap (lambda (i) (< i (node-val t))) (flatten (node-left t)))
       (tree? (node-right t))
       (andmap (lambda (i) (> i (node-val t))) (flatten (node-right t)))))
(define (flatten t)
  (cond [(symbol? t) empty]
        [(node? t) (append (flatten (node-left t))
                           (list (node-val t))
                           (flatten (node-right t)))]))
This structure holds information about a required testcase. desc is a string which will be printed if the student is missing the testcase. has? is a function that:
Update Fall 2018: There was an issue with A08 from F17 where students were allowed to require their own files (namely, they could require "ranking.rkt" in their "match.rkt"). Many students defined two constants, "students" and "employers", in both files. This broke the marking scripts because Racket complained of duplicate definitions. To get around this, we expanded the functionality of modules to allow regular filenames (as before) as well as filenames with particular functions/constants excluded or included. Note that you must use the 'common-collector-simple collector in collector-file below for this to work.
For example: (define modules '("a08lib.rkt" (except-in "ranking.rkt" employers students))) will require "a08lib.rkt" as usual, and require "ranking.rkt" except for the identifiers "employers" and "students".
Variable | Type | Meaning |
---|---|---|
timeout | Positive Integer | Roughly how much time the student's code is given to complete. The timeout and memory should be set very high. |
memory | Positive Integer | Roughly how much memory the student's code can use. The timeout and memory should be set very high. |
modules | List of String, or (list Sym Str Sym Sym...) | A list of teachpacks that the students can use. modules should also include any files that students can include with (require ...). |
bonuses | List of Symbol | A list of symbols, where each symbol is the name of a function for a bonus question. |
summary-line | String | summary-line will be printed right before the total number of tests that are missing. |
language-level | String | Pick the language level the assignment is in. Enter a number (argument to list-ref) to pick a language. |
get-fcns-from-eval | Syntax | List out the functions which are defined in the student's code, and which you want to use in demo.rkt. |
question-names | Hash Table mapping symbols to strings | Each key is a symbol representing the name of a function the student has to write. The value is a string, and it's the heading for that question in the output. |
short-names | Hash Table mapping symbols to strings | Each key is a symbol representing the name of a function the student has to write. The value is a string, and it's used when printing the number of testcases the student has. |
required-testcases | Association list | Each key is a symbol representing the name of a function the student has to write. The value is a list of required testcases for that function. |
add | Function | The add function is used to add a testcase that the students should have. add consumes a symbol (name of a function the student has to write), and a tc structure describing the required testcase. |
valid-input-checker | Hash Table mapping symbols to predicates | Each key is a symbol representing the name of a function the student has to write. The value is a function which is used to determine if a check-expect is valid. |
collector-file | String or Symbol | Which version of the collector will be used to tally test cases. |
debugging | Boolean #t or #f | Enables some additional output for a student submission. |
For a real assignment, you should use a more meaningful name than demo.rkt. For example, say you name the file a10.rkt. To run a10.rkt on all the students, you can use a Bash script similar to the following:
#!/bin/bash
base='/u/cs135/check-testcases/a10/'
# Output will be saved here
results='/u/cs135/check-testcases/a10/test-results/'
mkdir ${results}
cd /u/cs135/handin/a10_autotest/
for stud in *; do
    echo "Doing work for ${stud}"
    racket ${base}/a10.rkt ${stud}/skyscrapers.rkt \
        1> ${results}/${stud}_missing_testcases.txt \
        2> ${results}/${stud}_errors.txt
done
This will create a lot of empty XXX_errors.txt files, where XXX is a student's Quest ID. You can remove the empty files with the Python script remove_empty_files.py attached below. This script takes one argument, which is a path to a folder, and removes all empty files (ie files with a size of 0 bytes) from the folder.
cs135@linux028:~/check-testcases$ python remove_empty_files.py /u/cs135/check-testcases/a10/test-results/
Cleaning up /u/cs135/check-testcases/a10/test-results/
Occasionally, you may get permission errors. The sandbox environment is very strict, and when the students' code is run in sandbox, their code cannot access anything unless you allow it. An example of a permission error is:
FATAL ERROR occured with file studentscode.rkt: #(struct:exn:fail file-or-directory-modify-seconds: `read' access denied for /u3/cs135/check-testcases/temp/imagedata.rkt #<continuation-mark-set>)
To give students permission to read files (such as teachpacks and provided files), add the appropriate permissions to the sandbox-path-permissions variable.
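The sketch below shows one way such a permission entry could be built. sandbox-path-permissions is a parameter from racket/sandbox that holds a list of (list mode path) entries. The helper allow-read is a hypothetical name introduced for this example, and the path is the one from the error message above; how the actual scripts set the parameter may differ.

```racket
#lang racket
(require racket/sandbox)

;; Hypothetical helper: add a 'read entry for each path in paths
;; to an existing permission list.
(define (allow-read paths perms)
  (append (for/list ([p (in-list paths)])
            (list 'read p))
          perms))

;; The file named in the error message above.
(define readable-files
  '("/u3/cs135/check-testcases/temp/imagedata.rkt"))

;; Evaluators created inside this parameterize would be allowed
;; to read the listed files.
(parameterize ([sandbox-path-permissions
                (allow-read readable-files (sandbox-path-permissions))])
  (sandbox-path-permissions))
```

Granting 'read on individual files keeps the sandbox as strict as possible; a directory path can be used instead if students need several provided files.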
1. Pre-defined Selectors
Sometimes the selectors of a structure are already defined in full Racket. In this case, you need to use struct->vector and vector-ref to define a new function which extracts the specific field of a structure.
For example, there is a structure File which has data definition
(define-struct file (name size owner))
;; A File is a (make-file Str Nat Sym)
file-size is a built-in function in full Racket. You can define the following function, which works the same as the file-size selector.
;; my-file-size: File -> Nat
(define (my-file-size f)
  (vector-ref (struct->vector f) 2))
The index 2 selects the second field of the structure, because index 0 of the vector produced by struct->vector holds the name of the structure type.
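To see why index 2 is the size field, the following sketch prints the vector that struct->vector produces. It assumes the structure is transparent (the #:transparent option is added here so that the fields are visible in this stand-alone example), and the sample file values are made up.

```racket
#lang racket
;; Stand-alone sketch; #:transparent is an assumption for this example.
;; Note that this definition shadows the built-in file-size in this module.
(define-struct file (name size owner) #:transparent)

;; A made-up File for illustration.
(define example (make-file "notes.txt" 120 'alice))

;; Index 0 is the type name, index 1 is name, index 2 is size.
(struct->vector example)

;; my-file-size: File -> Nat
(define (my-file-size f)
  (vector-ref (struct->vector f) 2))

(my-file-size example)
```

The same pattern works for any field: the n-th field of the structure is at index n of the vector.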
2. Requiring Files
Sometimes students need to require other files in their solutions. You need to put these required files in the same folder as [file name]-tc.rkt. Also, if these required files are in a teaching language, you may want to remove #lang racket and put the following code at the beginning of these files:
;; The first three lines of this file were inserted by DrRacket. They record metadata
;; about the language level of this file in a form that our tools can easily process.
#reader(lib "htdp-intermediate-lambda-reader.ss" "lang")((modname [file-name]) (read-case-sensitive #t) (teachpacks ()) (htdp-settings #(#t constructor repeating-decimal #f #t none #f () #t)))
The reader library, htdp-intermediate-lambda-reader.ss in this example, determines the language this file uses. You can set it to whichever teaching language is relevant to the assignment.
If students have unfilled Tests/Cases or Highlighting rubric criteria in MarkUs after you run the MarkUs script for filling in check-testcases results, they may have one of the following problems:
- The student's code may have syntax or run-time errors. Copy their code into DrRacket and see if it reports errors. If it does, you can give 0 to all questions in the file that contains the error.
- If the file “runs” and there is a black highlighting issue in the file, there may be run-time errors in locally defined code that the student never tested (through check-expect/within/error). Check test-result to see if this is the case. If it is, you may have set up [filename]-tc.rkt with a non-simple collector. Use the simple collector if the questions don’t ask for templates. You may need to re-run autotesting on check testcases, re-make AUTOTESTING.ss, and re-auto fill marks if many students have this error.
- If none of the conditions above fit your situation, your testcase file (tc.rkt) may be missing a check, which causes errors while the check-testcases scripts run. For example, check list? before using andmap/ormap.
- If students include unusual things in their code and their files still run (for example, incorrect structure names), ask the instructors what to do next.
If students write a check-expect/check-within with the arguments in the wrong order, as in (check-expect expected-value (function args)), this usually does not cause problems. However, if the test looks like
(check-expect some-list/structure/symbol (function args)), the script ignores it.
If the test looks like
(check-expect Num (function args ...)), it should still be collected by the script.
(This is extremely outdated, and not exactly how the current scripts work)
Let's trace through the demo example. There are two scripts, demo.rkt which is the main script, and collector.rkt, which is a helper script. You can rename demo.rkt if you want.
demo.rkt needs one argument: the path to the student's file. It will first create a modified copy of the student's code, and save the modified copy as .check-testcases-tmp-file.rkt. This temporary file will be saved in the same directory as the script.
The file .check-testcases-tmp-file.rkt is the same as the student's code, except:
add-a-testcase consumes a list representing a check-expect test, and adds it to the front of the list testcases. The function produces (void). For example, (add-a-testcase '(check-expect (add1 4) 5)) will add the list '(check-expect (add1 4) 5) to the front of testcases. The script collector.rkt re-defines check-expect such that the new check-expect will call add-a-testcase.
get-all-testcases produces testcases, the list of all the testcases that have been added. You can use this function to access the student's testcases. (get-all-testcases) contains the added testcases in the reverse order that they were added. For example:
> (get-all-testcases)
'()
> (add-a-testcase '(check-expect (symbol? 'one) true))
> (add-a-testcase '(check-expect (symbol? 'two) true))
> (get-all-testcases)
'((check-expect (symbol? 'two) true) (check-expect (symbol? 'one) true))
Once the modified file .check-testcases-tmp-file.rkt is created, demo.rkt runs this file using make-module-evaluator. make-module-evaluator produces an evaluator, which is a function that consumes a list representing a Racket expression, and evaluates it in the context of the student's code. For example, if the produced evaluator is called e, then (e '(+ 1 2)) will produce 3. It will evaluate (+ 1 2) using the student's code. As another example, if the student defines a function (define (f x) x), then (e '(f 2)) will produce 2. It does not matter whether f is defined in demo.rkt or not, and (e '(f 2)) will use the f defined by the student.
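A minimal sketch of this behaviour, assuming a small inline module in place of .check-testcases-tmp-file.rkt (the module name student is made up for the example, and the default sandbox limits are used rather than the ones demo.rkt sets):

```racket
#lang racket
(require racket/sandbox)

;; Stand-in for the student's file: a module that defines f.
(define student-module
  '(module student racket/base
     (define (f x) x)))

;; e evaluates expressions in the context of the module above.
(define e (make-module-evaluator student-module))

(e '(+ 1 2)) ; evaluated inside the sandbox
(e '(f 2))   ; uses the f defined in student-module
```

Note that f is not defined anywhere in the outer file; it only exists inside the evaluator.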
demo.rkt calls (e '(get-all-testcases)), which produces a list of the student's testcases in reverse order. This list is stored in the variable student-testcases-list. For the demo example, student-testcases-list looks like:
(list '(bonus-fcn 0 0)
      (list 'my-equal? (iris 1 2 3 4) 'blueberry)
      '(my-equal? #t #f)
      '(my-equal? "a" "b")
      '(my-equal? #\a #\b)
      '(my-equal? sym1 symb2)
      '(my-equal? (1 2 (3 4)) (#t "abcdef" #\u #t #f ok))
      (list 'my-equal? (list (posn 23 23)) (list (posn 3 4)))
      '(my-equal? (a b c) #\c)
      (list 'my-equal? (posn 0 0) (posn 0 0))
      (list 'my-equal? (posn 0 0) 42)
      '(sum-lon (3.141592653589793))
      '(sum-lon "not a list")
      '(sum-lon (5 4))
      '(sum-lon (1 2 3))
      '(/ 1 0))
There is a one-to-one correspondence between student-testcases-list and the student's testcases.
demo.rkt takes student-testcases-list and filters out "bad" testcases, such as (check-expect (/ 1 0) 'inf). It also filters out tests where the inputs violate the function's contract. For example, if students have to write a function sum-lon to add a list of numbers, and they write the test
(check-expect (sum-lon "not a list") 'error)
then this test will be filtered out. Whether a testcase is filtered out is decided by the XXX/valid? functions, where XXX is the name of a function the student has to write. Each XXX/valid? function consumes a list representing the function application being tested (the first argument to check-expect), with the arguments already evaluated, as in student-testcases-list above. For example, if the student writes (check-expect (sum-lon (list 1 2 3)) 6) then sum-lon/valid? will consume the list '(sum-lon (1 2 3)). The XXX/valid? functions should produce true if the inputs are valid, and false otherwise. For example, for sum-lon, the validity checker is
(define (sum-lon/valid? fcn-app)
  (and (= 2 (length fcn-app))
       (equal? (first fcn-app) 'sum-lon)
       (list? (second fcn-app))
       (andmap number? (second fcn-app))))
For example, (sum-lon/valid? '(sum-lon "not a list")) is false, so the test case (check-expect (sum-lon "not a list") 'error) will be filtered out and ignored.
The filtered testcases (i.e. the valid testcases) are stored in the hash table student-testcases. In this hash table, the keys are Symbols representing the function names, and the values are lists of lists. Each inner list is the list of arguments passed to the function being tested. For the demo example, the hash table would look like:
(hash 'sum-lon '(((1 2 3))
                 ((5 4))
                 ((3.141592653589793)))
      'my-equal? (list (list (posn 0 0) 42)
                       (list (posn 0 0) (posn 0 0))
                       '((a b c) #\c)
                       (list (list (posn 23 23)) (list (posn 3 4)))
                       '((1 2 (3 4)) (#t "abcdef" #\u #t #f ok))
                       '(sym1 symb2)
                       '(#\a #\b)
                       '("a" "b")
                       '(#t #f)
                       (list (iris 1 2 3 4) 'blueberry))
      'bonus-fcn '((0 0)))
Note that the bad testcase (check-expect (sum-lon "not a list") 'error) is in student-testcases-list but not in the hash table, because it has been filtered out.
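The filtering and grouping step can be sketched as follows. The function group-valid-testcases is a hypothetical name, and treating a test with no registered checker as invalid is an assumption of this sketch; the actual script may handle that case differently. Only the sum-lon entries from the demo are used.

```racket
#lang racket
;; Illustrative sketch; names follow this page, bodies are assumptions.

;; Validity checker for sum-lon, as given above.
(define (sum-lon/valid? fcn-app)
  (and (= 2 (length fcn-app))
       (equal? (first fcn-app) 'sum-lon)
       (list? (second fcn-app))
       (andmap number? (second fcn-app))))

;; Maps each function name to its validity checker.
(define valid-input-checker (hash 'sum-lon sum-lon/valid?))

;; A shortened student-testcases-list (most recent test first).
(define student-testcases-list
  '((sum-lon (3.141592653589793))
    (sum-lon "not a list")
    (sum-lon (5 4))
    (sum-lon (1 2 3))
    (/ 1 0)))

;; Keep only the tests that have a checker and pass it, and store
;; their argument lists under the function name.
(define (group-valid-testcases tests checkers)
  (for/fold ([table (hash)])
            ([t (in-list tests)])
    (define valid? (hash-ref checkers (first t) #f))
    (if (and valid? (valid? t))
        (hash-update table (first t)
                     (lambda (args) (cons (rest t) args))
                     '())
        table)))

(define student-testcases
  (group-valid-testcases student-testcases-list valid-input-checker))

(hash-ref student-testcases 'sum-lon)
```

Adding each argument list to the front reverses the order again, which is why the sum-lon entries in the hash table appear in the order the student wrote them.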
Once the hash table is made, the script will check which testcases are missing. It follows this algorithm:
for each required testcase T
{
    meet_testcase = false;
    for each of the student's check-expect ce
    {
        if ce satisfies T
        {
            meet_testcase = true; // the student has testcase T
            break;
        }
    }
    if ( ! meet_testcase)
    {
        print "Test case " + T + " not met";
    }
}
Required testcases are added using the add function.
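The algorithm above can be sketched in Racket as follows. The tc structure with fields desc and has? comes from this page, but everything else is an assumption made for illustration: the body of add, the helper names lookup and missing-testcases, the two example requirements, and the idea that has? consumes the list of arguments of a single student test.

```racket
#lang racket
;; Illustrative sketch; not the code in demo.rkt or tc-lib.rkt.

;; desc is printed when the testcase is missing.
;; has? decides whether one student test satisfies the requirement
;; (assumed here to consume that test's list of arguments).
(struct tc (desc has?))

;; Association list: function name -> list of required tc structures.
(define required-testcases '())

;; lookup: produces the required testcases for fcn-name, or empty.
(define (lookup fcn-name)
  (define entry (assoc fcn-name required-testcases))
  (if entry (cdr entry) '()))

;; add: registers the required testcase t for the function fcn-name.
(define (add fcn-name t)
  (set! required-testcases
        (cons (cons fcn-name (append (lookup fcn-name) (list t)))
              (filter (lambda (entry) (not (eq? (car entry) fcn-name)))
                      required-testcases))))

;; missing-testcases: descriptions of the required testcases that
;; none of the student's argument lists satisfy.
(define (missing-testcases fcn-name student-args)
  (for/list ([t (in-list (lookup fcn-name))]
             #:unless (ormap (tc-has? t) student-args))
    (tc-desc t)))

;; Made-up requirements for sum-lon.
(add 'sum-lon (tc "empty list"
                  (lambda (args) (empty? (first args)))))
(add 'sum-lon (tc "list with at least two numbers"
                  (lambda (args) (>= (length (first args)) 2))))

;; The valid sum-lon tests from the demo example.
(missing-testcases 'sum-lon '(((1 2 3)) ((5 4)) ((3.141592653589793))))
```

With these two requirements, the demo student is missing only the "empty list" testcase, since (1 2 3) already satisfies the second requirement.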
After all required testcases are checked, demo.rkt will print how many testcases the student has missed in total. This total does not include missing testcases for bonus questions.
demo.rkt then prints how many distinct check-expects the student has for the function fcn using (length (remove-duplicates (hash-ref student-testcases fcn empty))).
Finally, demo.rkt prints whether the student's file is covered or not, and then quits.
Attachment | Size | Date | Who | Comment |
---|---|---|---|---|
check-testcases-a10.zip | 25.8 K | 2013-01-01 - 13:39 | YiLee | CS135 Fall 2012 A10 example |
check-testcases-demo.zip | 8.6 K | 2013-01-01 - 13:34 | YiLee | Demo Example |
redux-tc.rkt | 8.2 K | 2020-12-23 - 11:39 | AdamMehdi | |
remove_empty_files.py.txt | 0.3 K | 2013-01-01 - 14:58 | YiLee | A script that removes all empty files from a folder/directory. |
super-foldr-tc.rkt | 8.7 K | 2020-12-23 - 11:39 | AdamMehdi | |